[tahoe-dev] So how do *you* manage your keys, then? Re: cleversafe says: 3 Reasons Why Encryption is Overrated

Jason Resch jresch at cleversafe.com
Tue Aug 18 00:03:42 PDT 2009


Zooko Wilcox-O'Hearn wrote:
>
> On Monday, 2009-08-10, at 11:56, Jason Resch wrote:
>
> > You have stated how Cleversafe manages the key but not provided any 
> > details regarding how Tahoe-LAFS manages the decryption key?
>
> I think this is potentially Tahoe-LAFS's best contribution to the 
> state of the art, so I hope many of the readers of these lists will 
> think carefully about the following.
>
> The design of Tahoe-LAFS is to separate key management (== access 
> control) from data storage, and to make key management simple and 
> flexible.
>
> First, we boil down the key management problem for a given file or 
> directory to a single key, which is short (less than 100 bytes) so 
> that it is easier to manage.  This key suffices for both decryption 
> and integrity-checking.
>
Zooko,

I hope not to come off as overly critical.  I believe Tahoe has 
developed many interesting features and ideas, and its approach to 
encryption is better than that of many, if not most, other systems I am 
familiar with.  The problem I am about to point out is almost universal 
among cryptosystems, so forgive me if I sound like I am picking on 
Tahoe-LAFS; that is not my intention.

On the topic of key management, it seems that rather than addressing the 
problem, Tahoe-LAFS offloads it to the end-user.  In a sense, Tahoe-LAFS 
is like a super-compression algorithm:  send a file of any size through 
Tahoe-LAFS and get back a much smaller string of data.  As with a 
compression algorithm, the end-user is still responsible for reliably 
and securely storing the result.  The effect is that the security and 
reliability of the stored data can never exceed those of the system on 
which the user stores their identifiers.  Tahoe-LAFS thereby sacrifices 
perhaps the greatest benefit of dispersed storage: its ultra-high 
reliability.

Being small, identifiers are easy to replicate and therefore are easy to 
store reliably.  However, having many copies of these highly 
confidential identifiers in different locations or on different media 
makes it much more likely that they will be compromised, reducing the 
confidentiality of the data below that of keeping only a single 
instance.  Users of Tahoe-LAFS are faced with a difficult choice:

1. Keep data highly confidential, by not making copies of the identifiers
2. Keep data highly reliable, by replicating the identifiers

Luckily there is a third option, which can achieve the best of both worlds:

3. Use a secret sharing scheme to attain reliable and confidential key 
storage

While secret sharing schemes are the ideal method for key storage, 
Tahoe-LAFS doesn't provide this feature and, given its current design, 
cannot support it.  To have a secret sharing system, there must be some 
way to authenticate those who request shares.  In the case of 
Tahoe-LAFS, as I understand it, the authentication (or access control) 
key is the very secret one would try to secure using the secret sharing 
scheme.
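
To make option 3 concrete, here is a minimal sketch of a threshold
secret sharing scheme.  It assumes Shamir's scheme over a prime field,
which is one common choice; the post does not name a specific scheme,
and this is not a description of Cleversafe's or Tahoe-LAFS's
implementation.  The names split_secret and recover_secret and the
choice of modulus are illustrative assumptions.

```python
import secrets

# Assumption: the Mersenne prime 2**127 - 1 as the field modulus.
# A real system would pick a field that fits its key size.
PRIME = 2**127 - 1


def split_secret(secret, threshold, num_shares):
    """Split an integer secret into num_shares points on a random polynomial.

    Any `threshold` of the returned (x, y) points recover the secret.
    """
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in the range [0, PRIME)")
    if not 1 <= threshold <= num_shares:
        raise ValueError("need 1 <= threshold <= num_shares")
    # Random polynomial of degree threshold - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # Modular inverse via Fermat's little theorem (PRIME is prime).
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret


key = secrets.randbelow(PRIME)          # stand-in for a file key
shares = split_secret(key, threshold=3, num_shares=5)
assert recover_secret(shares[:3]) == key
```

With a 3-of-5 split, losing any two shares does not lose the key, and
holding fewer than three shares is not enough to reconstruct it.  The
sketch leaves out the part the paragraph above is about: how a share
holder decides whether a request for its share is legitimate.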

The only way to achieve the ultra-high reliability which information 
dispersal allows is to make sure keys are stored as reliably as the 
data.  A corollary of this is that the authentication credentials used 
to access the keys must be revocable and replaceable when lost; 
otherwise there is an infinite regress of how to protect the 
credentials from loss.  Cleversafe has adopted this approach, using 
replaceable authentication credentials and a secret sharing scheme to 
store both the key and data.  I should note that nothing in our approach 
precludes someone from encrypting their data and taking responsibility 
for the management of that key, but doing so sets an upper bound on 
reliability, equal to that of the key management system.
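
The reliability argument can be put in numbers.  The sketch below
assumes that key copies fail and leak independently and uses made-up
probabilities purely for illustration; the function names are
hypothetical and nothing here is measured from a real system.

```python
def replicated_key_risk(copies, p_loss, p_leak):
    """Return (P(every copy is lost), P(at least one copy leaks)).

    Assumes each copy is lost or leaked independently of the others.
    """
    p_all_lost = p_loss ** copies
    p_any_leaked = 1 - (1 - p_leak) ** copies
    return p_all_lost, p_any_leaked


def recovery_probability(p_data_survives, p_key_survives):
    """Data is recoverable only if both the ciphertext and the key survive."""
    return p_data_survives * p_key_survives


# Illustrative numbers only: each key copy has a 1% chance of loss
# and a 1% chance of leaking.
for n in (1, 3, 5):
    lost, leaked = replicated_key_risk(n, p_loss=0.01, p_leak=0.01)
    print(f"{n} copies: P(all lost)={lost:.2e}  P(any leaked)={leaked:.4f}")

# However reliable the dispersed data is, the product cannot exceed
# the key's own survival probability.
print(recovery_probability(p_data_survives=0.999999, p_key_survives=0.99))
```

Adding copies drives the loss probability down and the leak 
probability up, which is the choice between options 1 and 2, and the 
product in recovery_probability is the upper bound described above.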

>
> Second, we make a separate, independent key for every single file or 
> directory.  This means that access control decisions such as "Should 
> I share this file with my friend?" don't have to be linked to access 
> control of other files or directories.  (Although they *can* be 
> bundled together if desired.)
>
There is flexibility in having a separate key for each file, but it also 
means one needs to take the time to back up the newly generated keys 
after each upload session.  I suppose, though, that if one restricts 
uploads to children of directories whose key is already backed up, this 
step can be avoided.  Is that correct?
>
>
> Third, we *embed the key directly into the identifier of the file*.  
> This part is important.  You know how in a filesystem, whether local 
> or distributed, files have a unique "file handle" or identifier?  In 
> a traditional Unix filesystem it is the inode number.  Like a Unix 
> directory, a Tahoe-LAFS directory consists of a map from the name of 
> each child to the file handle to that child.  The critical decision 
> here is to embed the crypto key directly into that handle.  The 
> result is that when some human or some program wants to give another 
> human or program access to a Tahoe-LAFS file or directory, it does so 
> by giving the file handle.  This single value serves for access 
> control (you can't decrypt the file if you don't have it), 
> identification (the unique identifier of the file is its file handle), 
> and actual usage -- the file handle is sufficient to locate and 
> acquire the file contents.
>
That is interesting.  I think this article on Cryptree ( 
http://www.dcg.ethz.ch/publications/srds06.pdf ) would be of particular 
interest to you, if you haven't seen it before.  It is used by another 
dispersed storage service, Wuala ( http://www.wuala.com/ ).
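
The idea Zooko describes, a single short handle that locates, decrypts 
and integrity-checks a file, can be sketched roughly as follows.  This 
is an illustration only: the "cap:" string layout, the upload and 
download helpers, and the dict standing in for the storage grid are 
all hypothetical, and the SHA-256 keystream is a toy stand-in for a 
real cipher.  It is not Tahoe-LAFS's actual format or code.

```python
import base64
import hashlib
import secrets


def _keystream(key, length):
    """Toy keystream from SHA-256 in counter mode (stand-in for a real cipher)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _xor(data, key):
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


def _storage_index(key):
    # The server-visible location is derived from the key but does not reveal it.
    return hashlib.sha256(b"index:" + key).hexdigest()


def upload(plaintext, store):
    """Encrypt under a fresh per-file key and return a handle embedding that key."""
    key = secrets.token_bytes(16)
    ciphertext = _xor(plaintext, key)
    digest = hashlib.sha256(ciphertext).digest()
    store[_storage_index(key)] = ciphertext
    # Hypothetical handle layout: prefix, key, ciphertext hash.
    return ":".join(["cap",
                     base64.b32encode(key).decode(),
                     base64.b32encode(digest).decode()])


def download(cap, store):
    """Locate, verify and decrypt a file using nothing but its handle."""
    _, key_b32, digest_b32 = cap.split(":")
    key = base64.b32decode(key_b32)
    ciphertext = store[_storage_index(key)]
    if hashlib.sha256(ciphertext).digest() != base64.b32decode(digest_b32):
        raise ValueError("ciphertext does not match the hash in the handle")
    return _xor(ciphertext, key)


grid = {}                                   # stand-in for the storage servers
cap = upload(b"meet at the usual place", grid)
assert download(cap, grid) == b"meet at the usual place"
```

Handing someone the cap string gives them everything they need to read 
the file, and the storage side only ever sees the index and the 
ciphertext.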
>
>
> The resulting short string which serves as identifier, access control 
> token, and file handle is called a "capability" or a "cap" for 
> short.  There are several kinds of capability in Tahoe-LAFS.  The one 
> that I've described above is a "read-cap to an immutable file".
>
How is the immutability enforced?  In particular, it isn't clear to me 
how a write-cap allows one to update a file.  Is this something the 
servers check via a digital signature or HMAC on the updated data?
>
>
> Okay, my bus has arrived at work so I don't have time right now to 
> describe the other ones, but please observe that this design so far 
> already makes you start thinking about how you could build something 
> cool on top of it.  You can do so without having to think too much 
> about how the ciphertext is stored (it is erasure-coded and spread 
> across a distributed, fault-tolerant key-value storage grid), and 
> without having to know too much about how other programs or other 
> humans on the same system are managing their caps.
>
The freedom for the user to use anything is a double-edged sword.  Those 
with the right resources, infrastructure and know-how may be able to 
store their keys in an extremely reliable and secure manner, but the 
average user with less expertise or equipment may be left with a less 
than ideal system for securing keys.  The advantage of Tahoe-LAFS's 
approach is that its upper bound for key management can be as high as 
one is willing to make it, because it is not defined.  The downside is 
that the lower bound can also be as low as one allows.

>
> We owe thanks to many others including the authors of Self-certifying 
> filesystem, Freenet, Mojo Nation and especially the obj-cap ideas as 
> expressed by Mark Miller.
>
> Regards,
>
> Zooko
> <http://allmydata.org/cgi-bin/mailman/listinfo/tahoe-dev>
>
Thanks for this follow-up post.

Regards,

Jason


