#87 closed enhancement (wontfix)

store less validation information in each share, to lower overhead

Reported by: warner
Owned by:
Priority: minor
Milestone: undecided
Component: code-encoding
Version: 0.6.0
Keywords: encoding integrity
Cc:
Launchpad Bug:

Description

Once we have confidence in our FEC and decryption code, we may feel comfortable removing the extra validation data from the shares. This would reduce our per-share storage overhead, and slightly reduce the per-file transmission overhead.

This would remove the plaintext hash (32B), the plaintext hash tree (roughly 32B * 2 * ceil(filesize/2MB)), the crypttext hash tree (same), and the crypttext hash (32B). For small files (less than 2MB), this would reduce the per-share overhead from 846 bytes to 718 bytes, a saving of 128 bytes.
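The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes each hash tree is a complete binary tree whose leaf count is rounded up to a power of two (so a one-segment file has a one-node tree). Under that assumption the small-file saving comes out to 128 bytes, which agrees with the 846 to 718 byte figures quoted here.

```python
import math

HASH_SIZE = 32                    # bytes per hash, as stated in the ticket
SEGMENT_SIZE = 2 * 1024 * 1024    # "2MB" segments; binary megabytes are an assumption


def num_segments(filesize: int) -> int:
    """Segments in a file; an empty file is counted as one segment (assumption)."""
    return max(1, math.ceil(filesize / SEGMENT_SIZE))


def hash_tree_bytes(filesize: int) -> int:
    """Size of one hash tree, assuming leaves are padded to a power of two."""
    leaves = 1
    while leaves < num_segments(filesize):
        leaves *= 2
    # A complete binary tree with `leaves` leaves has 2*leaves - 1 nodes.
    return HASH_SIZE * (2 * leaves - 1)


def removable_bytes(filesize: int) -> int:
    """Plaintext hash + plaintext hash tree + crypttext hash tree + crypttext hash."""
    return 2 * HASH_SIZE + 2 * hash_tree_bytes(filesize)


if __name__ == "__main__":
    small = removable_bytes(1_000_000)
    print("small file saving:", small, "bytes; overhead", 846, "->", 846 - small)
    print("10 MiB file saving:", removable_bytes(10 * 1024 * 1024), "bytes")
```

For larger files the saving grows with the number of segments, which is where the "2 * ceil(filesize/2MB)" term in the estimate comes from.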

We would certainly want to implement #86 if we did this, to retain the ability to detect a mis-typed URI (using the wrong decryption key), since without a plaintext hash we'd have no other way to detect such corruption.

Change History (7)

comment:1 Changed at 2007-08-14T19:00:17Z by warner

  • Component changed from code to code-encoding
  • Owner somebody deleted

comment:2 Changed at 2007-09-25T04:36:19Z by zooko

  • Version changed from 0.4.0 to 0.6.0

comment:3 Changed at 2007-12-05T05:14:12Z by zooko

We've resolved #86 as wontfix (there isn't any danger of getting ciphertext back from tahoe if you start with the wrong encryption key, since the storage index is derived from the encryption key).
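A toy sketch of why a wrong key cannot return ciphertext: if the lookup handle is derived from the key, a mistyped key produces a different handle and the lookup simply misses. The derivation below (a tagged SHA-256 truncated to 16 bytes) and the `ToyServer` class are assumptions made for illustration, not tahoe's actual code.

```python
import hashlib
import os
from typing import Optional


def storage_index(key: bytes) -> bytes:
    """Hypothetical derivation: tagged SHA-256 of the key, truncated to 16 bytes."""
    return hashlib.sha256(b"example-storage-index-v1:" + key).digest()[:16]


class ToyServer:
    """In-memory stand-in for a storage server, keyed by storage index."""

    def __init__(self) -> None:
        self._shares: dict[bytes, bytes] = {}

    def put(self, index: bytes, share: bytes) -> None:
        self._shares[index] = share

    def get(self, index: bytes) -> Optional[bytes]:
        # Returns None when nothing is stored under this index.
        return self._shares.get(index)


if __name__ == "__main__":
    server = ToyServer()
    right_key = os.urandom(16)
    server.put(storage_index(right_key), b"ciphertext share")

    # Flip one bit to simulate a mistyped key.
    wrong_key = bytes([right_key[0] ^ 0x01]) + right_key[1:]
    print(server.get(storage_index(right_key)))   # the stored share
    print(server.get(storage_index(wrong_key)))   # None: the lookup misses
```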

There is another issue: it would be nice to have validation of ciphertext -- separately from validation of shares -- so that someone could write a client which isn't capable of erasure decoding, but is capable of checking the validation of the ciphertext, and which connects to a server that does the erasure decoding for it and gives it the ciphertext.
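As a rough illustration of what such a thin client would do, the sketch below checks ciphertext segments received from a decoding server against a root hash the client already trusts. The tree layout, the padding rule, and the one-byte leaf/node prefixes are assumptions chosen for this example; they are not tahoe's real hash tree format.

```python
import hashlib
import hmac


def leaf_hash(segment: bytes) -> bytes:
    # The 0x00 prefix separates leaf hashes from interior-node hashes.
    return hashlib.sha256(b"\x00" + segment).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Root of a binary hash tree; leaves are padded to a power of two (assumption)."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = list(leaves)
    while len(level) & (len(level) - 1):      # pad until the count is a power of two
        level.append(leaf_hash(b""))
    while len(level) > 1:
        level = [node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def verify_ciphertext(segments: list[bytes], trusted_root: bytes) -> bool:
    """True if the segments hash to the root the client already trusts."""
    root = merkle_root([leaf_hash(s) for s in segments])
    return hmac.compare_digest(root, trusted_root)


if __name__ == "__main__":
    segments = [b"segment-0", b"segment-1", b"segment-2"]
    root = merkle_root([leaf_hash(s) for s in segments])
    print(verify_ciphertext(segments, root))                      # True
    print(verify_ciphertext([b"tampered"] + segments[1:], root))  # False
```

The client never touches erasure-coded shares here: it only needs the trusted root and the ciphertext segments the server hands back.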

Personally, I'm not motivated by this need. I want to make the tahoe client itself efficient, well-packaged, and well-behaved enough that people who want to download data from tahoe while retaining confidentiality of their files simply run a tahoe client.

Furthermore, even if we are going to support a non-erasure-decoding-but-ciphertext-validating (and perhaps therefore also ciphertext-decrypting) client in the future, I suspect it will be okay to add validation on the ciphertext back in when we know that we'll need it.

So I'd be happy at this point to move ahead with this and leave in only the parts that we currently need.

comment:4 Changed at 2008-06-01T20:52:53Z by warner

  • Milestone changed from eventually to undecided

comment:5 follow-up: ↓ 6 Changed at 2008-06-10T23:05:48Z by zooko

We have already removed the plaintext hash and plaintext hash tree in order to avoid a failure of confidentiality.

comment:6 in reply to: ↑ 5 Changed at 2009-12-13T01:36:54Z by davidsarah

  • Keywords integrity added

Replying to zooko:

> We have already removed the plaintext hash and plaintext hash tree in order to avoid a failure of confidentiality.

(#453 asks to put back a per-file (not per-share) plaintext hash, in order to improve integrity in case of any problem with the FEC decoding or decryption. Also ticket:658#comment:2 points out how this can be used to avoid redundant uploads/downloads.)

So the remaining part of this ticket asks to remove the per-share ciphertext hashes. However, I don't agree that it is a good idea to remove those: until we have #453, they are providing useful additional robustness in case of implementation error. Also, the saving for small files from removing them is only 64 bytes. Furthermore, without these hashes how would a share be fully verified by a verify cap holder, or the storage server? I suggest resolving wontfix.

comment:7 Changed at 2009-12-13T02:33:23Z by zooko

  • Resolution set to wontfix
  • Status changed from new to closed

Thanks, David-Sarah.
