| | 44 | |
| | 45 | == Filecap Length == |
| | 46 | |
| | 47 | A likely security parameter K (=kappa) would be 96 or 128 bits, and most |
| | 48 | filecap lengths will be some multiple of K. |
| | 49 | |
| | 50 | Assuming a {{{tahoe:}}} prefix and no additional metadata, here's what |
| | 51 | various lengths of base62-encoded filecaps would look like: |
| | 52 | |
| | 53 | * 1*K: |
| | 54 | * 96 {{{tahoe:14efs6T5YNim0vDVV}}} |
| | 55 | * 128 {{{tahoe:4V2uIYVX0PcHu9fQrJ3GSH}}} |
| | 56 | * 2*K: |
| | 57 | * 192 {{{tahoe:072Og6e75IOP9ZZsbR1Twjs6X5xXJnBAF}}} |
| | 58 | * 256 {{{tahoe:fZeioazoWrO62reiAjzUAyV0uz3ssh6Hnanv8cKMClY}}} |
| | 59 | * 3*K: |
| | 60 | * 288 {{{tahoe:11DriaxD9nipA10ueBvv5uoMoehvxgPerpQiXyvMPeiUUdtf6}}} |
| | 61 | * 384 {{{tahoe:3a31SqUbf8fpWE1opRCT3coDhRqTU7bDU2AvC3RQJBu6ZNFhVscyxA9slYtPVT79x}}} |
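The lengths above follow from base62 carrying a little under 6 bits per character. A minimal sketch of that arithmetic, assuming a digits/lowercase/uppercase alphabet order (this page does not specify one) and the hypothetical helper names `base62_length` and `base62_encode`:

```python
import secrets

# Assumed alphabet order; the page does not say which ordering is used.
BASE62_ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def base62_length(bits: int) -> int:
    """Return the smallest n such that 62**n >= 2**bits."""
    n = 0
    while 62 ** n < 2 ** bits:
        n += 1
    return n


def base62_encode(data: bytes) -> str:
    """Encode bytes as fixed-width base62, left-padded with the zero digit."""
    value = int.from_bytes(data, "big")
    width = base62_length(8 * len(data))
    digits = []
    for _ in range(width):
        value, rem = divmod(value, 62)
        digits.append(BASE62_ALPHABET[rem])
    return "".join(reversed(digits))


if __name__ == "__main__":
    for bits in (96, 128, 192, 256, 288, 384):
        cap = "tahoe:" + base62_encode(secrets.token_bytes(bits // 8))
        print(bits, base62_length(bits), cap)
```

For 96, 128, 192, 256, 288 and 384 bits this gives 17, 22, 33, 43, 49 and 65 characters, which matches the example strings above.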
| | 62 | |
| | 63 | Adding 2 metadata characters and a clear separator gives us: |
| | 64 | |
| | 65 | * 96: {{{tahoe:MW-14efs6T5YNim0vDVV}}} |
| | 66 | * 128: {{{tahoe:DW-4V2uIYVX0PcHu9fQrJ3GSH}}} |
| | 67 | * 192: {{{tahoe:MR-072Og6e75IOP9ZZsbR1Twjs6X5xXJnBAF}}} |
| | 68 | * 256: {{{tahoe:DR-fZeioazoWrO62reiAjzUAyV0uz3ssh6Hnanv8cKMClY}}} |
| | 69 | * 288: {{{tahoe:MR-11DriaxD9nipA10ueBvv5uoMoehvxgPerpQiXyvMPeiUUdtf6}}} |
| | 70 | * 384: {{{tahoe:MR-3a31SqUbf8fpWE1opRCT3coDhRqTU7bDU2AvC3RQJBu6ZNFhVscyxA9slYtPVT79x}}} |
| | 71 | |
| | 72 | = Design Proposals = |
| | 73 | |
| | 74 | == Commonalities == |
| | 75 | |
| | 76 | * Once we get the ciphertext, it gets segmented and erasure-coded in the |
| | 77 | same way as immutable files. Shares include a merkle tree over the share |
| | 78 | blocks, and a second one over the ciphertext segments. |
| | 79 | * We'd like to add a merkle tree over the plaintext, without reintroducing |
| | 80 | the partial-information-guessing attack that prompted us to remove it. |
| | 81 | This means encrypting the nodes of this merkle tree with a key derived |
| | 82 | from the readcap. |
| | 83 | * We'll continue to use the signing layout of the current mutable files: |
| | 84 | there will be a UEB that includes the roots of the hash trees (and |
| | 85 | everything else in the share), which will be hashed to compute the |
| | 86 | "roothash" (which changes with each publish). A block of data that |
| | 87 | includes the roothash and a sequence number (as well as any |
| | 88 | data-encrypting salt) will be signed. |
| | 89 | * It might be good to make the layout a bit more extensible, like the way |
| | 90 | that immutable files have a dictionary-like structure for the UEB. |
| | 91 | * In general, the share will always contain a full copy of the pubkey, for |
| | 92 | the benefit of server-side validation. I don't think it matters whether |
| | 93 | the pubkey is stored inside or outside of the signed block, but it will |
| | 94 | probably make the upload-time share-verification code simpler to put it |
| | 95 | inside. |
| | 96 | * In general, the storage-index will be equal to the pubkey. If the pubkey |
| | 97 | is too long for this, the storage-index will be a sufficiently-long secure |
| | 98 | hash of the pubkey. The SI must be long enough to meet our |
| | 99 | collision-resistance criteria. |
| | 100 | |
| | 101 | == ECDSA, semi-private keys, no traversalcap == |
| | 102 | |
| | 103 | Zooko captured the current leading semi-private-key-using mutable file design |
| | 104 | nicely in the [http://allmydata.org/~zooko/lafs.pdf "StorageSS08" paper] |
| | 105 | (in Figure 3). The design is: |
| | 106 | |
| | 107 | * (1K) writecap = K-bit random string (perhaps derived from user-supplied |
| | 108 | material) (remember, K=kappa, probably 128 bits) |
| | 109 | * (2K) readcap = 2*K-bit semiprivate key |
| | 110 | * (2K) verifycap = 2*K-bit public key |
| | 111 | * storage-index = truncated verifycap |
| | 112 | |
| | 113 | On each publish, a random salt is generated and stored in the share. The data |
| | 114 | is encrypted with H(salt, readcap) and the ciphertext stored in the share. We |
| | 115 | store the usual merkle trees. |
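The per-publish key derivation H(salt, readcap) could be sketched as follows. This is only an illustration: SHA-256, the tag string, the length-prefixed framing, the salt size, and the name `data_key` are all assumptions rather than part of the design, and the readcap here is random bytes standing in for a real semi-private key.

```python
import hashlib
import os

K_BYTES = 16  # K = 128 bits (assumed)


def data_key(salt: bytes, readcap: bytes) -> bytes:
    """Derive a K-bit data-encryption key from the salt and the readcap."""
    h = hashlib.sha256()
    h.update(b"example-mutable-datakey-v1:")  # hypothetical domain-separation tag
    for part in (salt, readcap):
        # Length-prefix each input so different (salt, readcap) splits
        # never hash to the same byte stream.
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.digest()[:K_BYTES]


# Stand-in for a 2*K-bit semi-private key.
readcap = os.urandom(2 * K_BYTES)

# Each publish picks a fresh salt, so each version gets a fresh key.
salt_v1 = os.urandom(K_BYTES)
salt_v2 = os.urandom(K_BYTES)
key_v1 = data_key(salt_v1, readcap)
key_v2 = data_key(salt_v2, readcap)
```

A reader who holds the readcap and fetches the salt from the signed block can recompute the same key; a server that sees only the salt cannot.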
| | 116 | |
| | 117 | This provides offline attenuation and full server-side validation. It removes |
| | 118 | the need to pull a copy of the pubkey or encprivkey from just one of the |
| | 119 | servers (the salt must still be fetched, but it's small and lives in the |
| | 120 | signed block that must be fetched anyway). |
| | 121 | |
| | 122 | === add traversalcap === |
| | 123 | |
| | 124 | Like above, but create two levels of semiprivate keys instead of just one: |
| | 125 | |
| | 126 | * (1K) writecap = K-bit random string |
| | 127 | * (2K) readcap = 2*K-bit first semiprivate key |
| | 128 | * (2K) traversalcap = 2*K-bit second semiprivate key |
| | 129 | * (2K) verifycap = 2*K-bit public key |
| | 130 | * storage-index = truncated verifycap |
| | 131 | |
| | 132 | The dirnode encoding would use H(writecap) to protect the child writecaps, |
| | 133 | H(readcap) to protect the child readcaps, and H(traversalcap) to protect the |
| | 134 | child verifycap/traversalcaps. |
| | 135 | |
| | 136 | == ECDSA, no semi-private keys, no traversalcap == |
| | 137 | |
| | 138 | Without semi-private keys, we need something more complicated to protect the |
| | 139 | readkey: the only thing that can be mathematically derived from the writecap |
| | 140 | is the pubkey, and that can't be used to protect the data because it's public |
| | 141 | (and used by the server to validate shares). One approach is to use the |
| | 142 | current (discrete-log DSA) mutable file structure, and merely move the |
| | 143 | private key out of the share and into the writecap: |
| | 144 | |
| | 145 | * (1K) writecap = K-bit random string |
| | 146 | * (3K) readcap = H(writecap)[:K] + H(pubkey) |
| | 147 | * (2K) verifycap = H(pubkey) |
| | 148 | * storage-index = truncated verifycap |
| | 149 | |
| | 150 | In this case, the readcap/verifycap holder is obligated to fetch the pubkey |
| | 151 | from one of the shares, since they cannot derive it themselves. This |
| | 152 | preserves offline attenuation and server-side validation. The readcap grows |
| | 153 | to (1+2)*K : we can truncate the AES key since we only need K bits for K-bit |
| | 154 | confidentiality, but require 2*K bits on H(pubkey) to attain K-bit collision |
| | 155 | resistance. The verifycap is 2*K. |
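A rough sketch of this cap layout, to make the lengths and the offline attenuation concrete. SHA-256, the tag strings, the K-byte storage-index truncation, and the function names are assumptions for illustration. The pubkey is passed in as opaque bytes: in the real design it would be the ECDSA public key derived from the writecap, which this sketch does not attempt.

```python
import hashlib
import os

K_BYTES = 16  # K = 128 bits (assumed)


def tagged_hash(tag: bytes, data: bytes) -> bytes:
    """SHA-256 with a simple tag prefix (hypothetical framing)."""
    return hashlib.sha256(tag + b":" + data).digest()


def derive_caps(writecap: bytes, pubkey: bytes) -> dict:
    """Build readcap, verifycap and storage-index from the writecap and pubkey."""
    pubkey_hash = tagged_hash(b"pubkey", pubkey)           # 2*K bits, collision-resistant
    readkey = tagged_hash(b"readkey", writecap)[:K_BYTES]  # K bits, confidentiality only
    return {
        "writecap": writecap,                    # 1K
        "readcap": readkey + pubkey_hash,        # 3K
        "verifycap": pubkey_hash,                # 2K
        "storage_index": pubkey_hash[:K_BYTES],  # truncated verifycap (length assumed)
    }


def readcap_to_verifycap(readcap: bytes) -> bytes:
    """Offline attenuation: drop the read key, keep H(pubkey)."""
    return readcap[K_BYTES:]


caps = derive_caps(os.urandom(K_BYTES), os.urandom(2 * K_BYTES))
```

With K=128 the readcap is 48 bytes and the verifycap 32 bytes, and a readcap holder can produce the verifycap without contacting any server.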
| | 156 | |
| | 157 | === include pubkey in cap === |
| | 158 | |
| | 159 | Or, if the pubkey is short enough, include it in the cap rather than |
| | 160 | requiring the client to fetch a copy: |
| | 161 | |
| | 162 | * (1K) writecap = K-bit random string |
| | 163 | * (3K) readcap = H(writecap)[:K] + pubkey |
| | 164 | * (2K) verifycap = pubkey |
| | 165 | * storage-index = H(pubkey) |
| | 166 | |
| | 167 | I think ECDSA pubkeys are 2*K long, so this would not change the length of |
| | 168 | the readcaps. It would just simplify/speed-up the download process. If we |
| | 169 | could use shorter hashes, then the H(pubkey) design might give us slightly |
| | 170 | shorter caps. |
| | 171 | |
| | 172 | === add traversalcap === |
| | 173 | |
| | 174 | Since a secure pubkey identifier (either H(pubkey) or the original pubkey) |
| | 175 | is present in all caps, it's easy to insert arbitrary intermediate levels. It |
| | 176 | doesn't even change the way the existing caps are used: |
| | 177 | |
| | 178 | * (1K) writecap = K-bit random string |
| | 179 | * (3K) readcap = H(writecap)[:K] + H(pubkey) |
| | 180 | * (3K) traversalcap = H(readcap)[:K] + H(pubkey) |
| | 181 | * (2K) verifycap = H(pubkey) |
| | 182 | * storage-index = truncated verifycap |
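The extra attenuation step can be sketched like this. It assumes H is SHA-256 with a hypothetical tag, and that H(readcap) hashes the whole 3K readcap string; the notes do not say whether the full readcap or only its key portion is hashed.

```python
import hashlib

K_BYTES = 16  # K = 128 bits (assumed)


def tagged_hash(tag: bytes, data: bytes) -> bytes:
    """SHA-256 with a simple tag prefix (hypothetical framing)."""
    return hashlib.sha256(tag + b":" + data).digest()


def readcap_to_traversalcap(readcap: bytes) -> bytes:
    """traversalcap = H(readcap)[:K] + H(pubkey)."""
    pubkey_hash = readcap[K_BYTES:]  # the H(pubkey) part is carried over unchanged
    traversal_key = tagged_hash(b"traversalkey", readcap)[:K_BYTES]
    return traversal_key + pubkey_hash


def traversalcap_to_verifycap(traversalcap: bytes) -> bytes:
    """verifycap = H(pubkey): drop the traversal key."""
    return traversalcap[K_BYTES:]


# Fixed placeholder readcap: K bytes of read key followed by 2*K bytes of H(pubkey).
example_readcap = bytes(range(3 * K_BYTES))
example_traversalcap = readcap_to_traversalcap(example_readcap)
```

Because every cap ends in the same H(pubkey), further intermediate levels could be chained the same way without changing how the existing caps are parsed.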