[tahoe-dev] So how do *you* manage your keys, then? Re: cleversafe says: 3 Reasons Why Encryption is Overrated
Jason Resch
jresch at cleversafe.com
Tue Aug 18 00:03:42 PDT 2009
Zooko Wilcox-O'Hearn wrote:
>
> On Monday, 2009-08-10, at 11:56, Jason Resch wrote:
>
> > You have stated how Cleversafe manages the key but not provided any
> > details regarding how Tahoe-LAFS manages the decryption key?
>
> I think this is potentially Tahoe-LAFS's best contribution to the
> state of the art, so I hope many of the readers of these lists will
> think carefully about the following.
>
> The design of Tahoe-LAFS is to separate key management (== access
> control) from data storage, and to make key management simple and
> flexible.
>
> First, we boil down the key management problem for a given file or
> directory to a single key, which is short (less than 100 bytes) so
> that it is easier to manage. This key suffices for both decryption
> and integrity-checking.
>
Zooko,
I hope not to come off as overly critical. I believe Tahoe has
developed many interesting features and ideas, and its approach to
encryption is better than that of many, if not most, other systems I am
familiar with. The problem I am about to point out is almost universal
among cryptosystems, so forgive me if I sound like I am picking on
Tahoe-LAFS; that is not my intention.
On the topic of key management, it seems that rather than addressing the
problem, Tahoe-LAFS offloads it to the end-user. In a sense, Tahoe-LAFS
is like a super-compression algorithm: send a file of any size through
Tahoe-LAFS and get back a much smaller string of data. As with a
compression algorithm, the end-user is still responsible for reliably
and securely storing the result. The effect is that the security and
reliability of the stored data can never exceed that of the system on
which the user stores their identifiers. Tahoe-LAFS thereby sacrifices
perhaps the greatest benefit of dispersed storage: its ultra-high
reliability.
Being small, identifiers are easy to replicate and therefore are easy to
store reliably. However, having many copies of these highly
confidential identifiers in different locations or on different media
makes it much more likely that they will be compromised, reducing the
confidentiality of the data below that of keeping only a single
instance. Users of Tahoe-LAFS are faced with a difficult choice:
1. Keep data highly confidential, by not making copies of the identifiers
2. Keep data highly reliable, by replicating the identifiers
Luckily there is a third option, which can achieve the best of both worlds:
3. Use a secret sharing scheme to attain reliable and confidential key
storage
While secret sharing schemes are the ideal method for key storage,
Tahoe-LAFS doesn't provide this feature and, given its current design,
cannot support it. To have a secret sharing system, there must be some
way to authenticate those who request shares. In the case of
Tahoe-LAFS, as I understand it, the authentication (or access control)
key is the very secret one would try to secure using the secret sharing
scheme.
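
To make option 3 concrete, here is a minimal sketch of a k-of-n secret
sharing scheme (Shamir's scheme over a prime field) in Python. It is an
illustration only, not Cleversafe's or Tahoe-LAFS's implementation; the
names split_secret and recover_secret and the choice of prime are
assumptions made for this example, and it does not address the share
authentication problem described above.

```python
import secrets

# Field modulus: the Mersenne prime 2**521 - 1, comfortably larger than
# any 128- or 256-bit key. (Choice of prime is an assumption of this sketch.)
PRIME = 2**521 - 1


def split_secret(secret: int, threshold: int, num_shares: int):
    """Split `secret` into `num_shares` points; any `threshold` recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in the range [0, PRIME)")
    if not 1 <= threshold <= num_shares:
        raise ValueError("need 1 <= threshold <= num_shares")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange-interpolate the polynomial at x = 0 to get the secret back."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


if __name__ == "__main__":
    key = secrets.randbits(128)            # stand-in for a file key
    shares = split_secret(key, threshold=3, num_shares=5)
    print(recover_secret(shares[:3]) == key)
```

Any 3 of the 5 shares reconstruct the key, so up to 2 can be lost, while
any 2 shares on their own reveal nothing about it.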
The only way to achieve the ultra-high reliability which information
dispersal allows is to make sure keys are stored as reliably as the
data. A corollary of this is that the authentication credentials used
to access the keys must be something which is revocable and replaceable
when lost, otherwise there is an infinite regress of how to protect the
credentials from loss. Cleversafe has adopted this approach, using
replaceable authentication credentials and a secret sharing scheme to
store both the key and data. I should note that nothing in our approach
precludes someone from encrypting their data and taking responsibility
for the management of that key, but doing so sets an upper bound on
reliability equal to that of the key management system.
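
The upper bound on reliability can be illustrated with a little
arithmetic. The sketch below assumes, purely for illustration, that
slices fail independently, that any k of n slices suffice to rebuild the
data, and that a separately managed key survives with probability p_key.
The function names and the independence model are assumptions of this
example, not measurements of any real system.

```python
from math import comb


def dispersal_reliability(k: int, n: int, p_share: float) -> float:
    """Probability that at least k of n independent slices survive."""
    return sum(
        comb(n, i) * p_share**i * (1 - p_share) ** (n - i)
        for i in range(k, n + 1)
    )


def end_to_end_reliability(k: int, n: int, p_share: float, p_key: float) -> float:
    """Data is readable only if the key AND at least k slices survive.

    Assumes key loss is independent of slice loss.
    """
    return p_key * dispersal_reliability(k, n, p_share)


if __name__ == "__main__":
    # Hypothetical numbers: 3-of-10 dispersal, 90% per-slice survival,
    # and a user-managed key that survives with probability 0.99.
    print(dispersal_reliability(3, 10, 0.9))
    print(end_to_end_reliability(3, 10, 0.9, 0.99))
```

Under these assumptions the dispersal term is very close to 1, so the
product is dominated by p_key: however reliable the dispersed data is,
the end-to-end figure can never exceed the reliability of the key.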
>
> Second, we make a separate, independent key for every single file or
> directory. This means that access control decisions such as "Should
> I share this file with my friend?" don't have to be linked to access
> control of other files or directories. (Although they *can* be
> bundled together if desired.)
>
There is flexibility in having a separate key for each file, but it also
means one needs to take the time to back up the newly generated keys
after each upload session. I suppose this step can be avoided if one
restricts uploads to children of directories whose key is already backed
up. Is that correct?
>
>
> Third, we *embed the key directly into the identifier of the file*.
> This part is important. You know how in a filesystem, whether local
> or distributed, files have a unique "file handle" or identifier? In
> a traditional Unix filesystem it is the inode number. Like a Unix
> directory, a Tahoe-LAFS directory consists of a map from the name of
> each child to the file handle to that child. The critical decision
> here is to embed the crypto key directly into that handle. The
> result is that when some human or some program wants to give another
> human or program access to a Tahoe-LAFS file or directory, it does so
> by giving the file handle. This single value serves for access
> control (you can't decrypt the file if you don't have it),
> identification (the unique identifier of the file is its file handle),
> and actual usage -- the file handle is sufficient to locate and
> acquire the file contents.
>
That is interesting. I think this article on Cryptree (
http://www.dcg.ethz.ch/publications/srds06.pdf ) would be of particular
interest to you, if you haven't seen it before. It is used by another
dispersed storage service, Wuala ( http://www.wuala.com/ ).
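
As a rough illustration of the quoted idea of embedding the key in the
file handle, here is a toy Python sketch in which one short string serves
to locate, decrypt, and integrity-check a file. Everything in it is an
assumption made for the example: the "toycap:" format, the put/get names,
the dict standing in for a storage grid, and the SHA-256 counter-mode
keystream, which is a stand-in cipher and not suitable for real use. It
is not the actual Tahoe-LAFS capability format.

```python
import base64
import hashlib
import hmac
import secrets


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key || counter. Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _xor(data: bytes, key: bytes) -> bytes:
    """Encrypt or decrypt by XOR with the keystream (same operation)."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


def _storage_index(key: bytes) -> bytes:
    """Where the ciphertext lives; derived from the key, reveals nothing useful."""
    return hashlib.sha256(b"index:" + key).digest()[:16]


def put(store: dict, plaintext: bytes) -> str:
    """Store the encrypted file and return a capability string for it."""
    key = secrets.token_bytes(16)              # fresh key per file
    ciphertext = _xor(plaintext, key)
    digest = hashlib.sha256(ciphertext).digest()
    store[_storage_index(key)] = ciphertext    # the store never sees the key
    return "toycap:{}:{}".format(
        base64.urlsafe_b64encode(key).decode("ascii"),
        base64.urlsafe_b64encode(digest).decode("ascii"),
    )


def get(store: dict, cap: str) -> bytes:
    """Locate, verify, and decrypt a file using only its capability."""
    prefix, key_b64, digest_b64 = cap.split(":")
    if prefix != "toycap":
        raise ValueError("not a toy capability")
    key = base64.urlsafe_b64decode(key_b64)
    digest = base64.urlsafe_b64decode(digest_b64)
    ciphertext = store[_storage_index(key)]
    if not hmac.compare_digest(hashlib.sha256(ciphertext).digest(), digest):
        raise ValueError("integrity check failed")
    return _xor(ciphertext, key)
```

Handing someone the string returned by put() is the whole access-control
decision in this sketch: whoever holds it can fetch, verify, and read the
file, and the storage side holds only ciphertext.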
>
>
> The resulting short string which serves as identifier, access control
> token, and file handle is called a "capability" or a "cap" for
> short. There are several kinds of capability in Tahoe-LAFS. The one
> that I've described above is a "read-cap to an immutable file".
>
How is the immutability enforced? In particular, it isn't clear to me
how a write-cap allows one to update a file. Is this something the
servers check via a digital signature or HMAC on the updated data?
>
>
> Okay, my bus has arrived at work so I don't have time right now to
> describe the other ones, but please observe that this design so far
> already makes you start thinking about how you could build something
> cool on top of it. You can do so without having to think too much
> about how the ciphertext is stored (it is erasure-coded and spread
> across a distributed, fault-tolerant key-value storage grid), and
> without having to know too much about how other programs or other
> humans on the same system are managing their caps.
>
The freedom for the user to use anything is a double-edged sword. Those
having the right resources, infrastructure and know-how may be able to
store their keys in an extremely reliable and secure manner, but the
average user with less expertise or equipment may be left with a less
than ideal system for securing keys. The advantage of the Tahoe-LAFS
approach is that its upper bound for key management can be as high as
one is willing to make it, because it is not defined. The downside is
that the lower bound can also be as low as one allows.
>
> We owe thanks to many others including the authors of Self-certifying
> filesystem, Freenet, Mojo Nation and especially the obj-cap ideas as
> expressed by Mark Miller.
>
> Regards,
>
> Zooko
> <http://allmydata.org/cgi-bin/mailman/listinfo/tahoe-dev>
>
Thanks for this follow-up post.
Regards,
Jason