#684 closed enhancement (wontfix)

let client specify the encryption key

Reported by: zooko Owned by: warner
Priority: major Milestone: eventually
Component: code-encoding Version: 1.4.1
Keywords: review Cc: swillden
Launchpad Bug:

Description

Per this tahoe-dev discussion, Shawn Willden has submitted a patch to allow the client to choose the encryption key for an immutable file upload. This is a very dangerous feature, because Tahoe doesn't use unique IVs under the hood, therefore you lose confidentiality if you ever ask Tahoe to use the same encryption key twice.

We could make this less dangerous by using random IVs.

Anyway, it is unfortunate that we didn't pay attention to Shawn patch until now, simply because there wasn't a ticket for it. So now there is.

This is one of the requirements to implement #320 (add streaming upload to HTTP interface), which is a ticket that I would love to see fixed.

Attachments (3)

client-keys.dpatch (187.1 KB) - added by warner at 2009-05-05T21:12:16Z.
Shawn's client-choose-encryption-key patch
new-684.diff (19.1 KB) - added by warner at 2009-05-16T22:14:14Z.
fix up patch to apply against trunk as of 15-May-2009
new2-684.diff (19.2 KB) - added by warner at 2009-05-16T23:29:10Z.
new version of the patch, with minor stylistic changes applied

Download all attachments as: .zip

Change History (10)

comment:1 Changed at 2009-04-22T19:18:00Z by swillden

I think it's a little worse than just loss of confidentiality. Since the storage ID is derived from the key, won't adding two files with the same key cause the first one to be lost? Or will the storage servers just refuse to accept another share of the same SID? If that's the case, and new servers have been added to the grid, it's possible that shares of the second file could be stored and then when the client tries to download the file it gets a mixture of shares from the two files... essentially losing both.

Clients should only set their own encryption key if they use another mechanisms to ensure that a given encryption key is only used once.

comment:2 Changed at 2009-04-22T23:19:02Z by warner

Re-using a key (which is equivalent to re-using a storage-index) will probably cause the second file to get lost. The uploading code will ask the storage servers to accept a share for storage-index XYZ, the server will say "I already have a share for that", the uploader will say "oh, ok, I'll use the existing share instead", and the result will be that the uploader will construct a filecap that won't be downloadable, and the second file will be lost.

Actually, in the current release, this depends upon whether any servers have been added since the first upload. If the first part of the permuted list is still the same, then the uploader will re-use the existing shares. If the uploader has a different peer list, then it might place new shares anyways, because our uploader code isn't very smart about avoiding duplicate shares yet (something that needs to improve to make the Repairer work better). In this case, you could wind up with some shares for the first version and some shares for the second version. In theory, both versions might wind up being downloadable: someone using the first filecap would see good shares and corrupt shares, someone using the second filecap would see corrupt shares and good shares. The wrinkle is that the downloader is not yet clever enough to switch to good shares once it sees corruption, so the most likely effect would be both versions getting bad-hash errors on download.

Eventually, the uploader code needs to be improved: if a server claims to have a share, the uploader should check to see that the share looks usable. The simplest check is to compare file lengths at the start of upload and UEB hashes at the end of upload; a more complete check would be to download the whole share and compare it to the one being generated, or challenge the server to return a keyed hash of the share and do the same comparison.

But yeah, in any case, it becomes important for the uploader to avoid re-using an encryption key. At best it will cause confusion.

comment:3 Changed at 2009-05-01T16:10:52Z by warner

  • Keywords review added
  • Owner set to warner
  • Status changed from new to assigned

Changed at 2009-05-05T21:12:16Z by warner

Shawn's client-choose-encryption-key patch

Changed at 2009-05-16T22:14:14Z by warner

fix up patch to apply against trunk as of 15-May-2009

comment:4 Changed at 2009-05-16T22:23:09Z by warner

A few style questions:

  • I'd prefer to use a marker class instead of "random" to indicate that a randomly-generated key is desired. This reduces the potential for confusing the "please generate a random key" directive with "please use the string 'random' as the key". I'd just use "class Random: pass", and then the receiving code can do "if thing is Random".
  • I'm not sure that giving folks three ways to specify the key (base32, base16, base16-with-"hex"-prefix) is a good idea.. think of Python's "There should only be one way to do it" rule. On the other hand, I don't know which one I'd drop: we use base32 everywhere so it'd be a shame to drop it, but if I were a webapi client writer I might find it a lot easier to generate a hex string than to dig up a base32 library somewhere. So, maybe it'd be best to leave the options as they stand.
  • really minor style things: I tend to not put spaces around the "=" in the default argument values in def() statements (maybe this should be called "cuddled default args"?). And I tend to use ", " as the import-statement delimiter instead of "," (non-cuddled imports?).

otherwise, it looks pretty good.. I'll probably apply it to trunk soon.

Changed at 2009-05-16T23:29:10Z by warner

new version of the patch, with minor stylistic changes applied

comment:6 Changed at 2009-05-17T14:32:06Z by zooko

How about this for the warning paragraph:

"Be VERY careful that you know what you're doing if you use this feature. Choosing bad keys could compromise the security of your files. Also, a key MUST NOT be used more than once. If the same key is ever used more than once -- whether on more than one file or even on the same file with more than one set of FEC parameters (K, N, segsize) -- it will expose the cleartext of the files it was used on, as well as causing uploads of those files to fail even while indicating successful upload."

comment:7 Changed at 2009-06-10T16:36:14Z by zooko

  • Resolution set to wontfix
  • Status changed from assigned to closed

Per this discussion: http://allmydata.org/pipermail/tahoe-dev/2009-May/001817.html , we're not going to support the client specifying the encryption key for now. Thanks for the patch and requirements discussion, Shawn.

Note: See TracTickets for help on using tickets.