[tahoe-dev] how to squeeze a CHK URI
zooko
zooko at zooko.com
Fri Sep 21 11:46:11 PDT 2007
Regarding the CHK URI scheme, which is colorfully diagrammed here:
http://allmydata.org/trac/tahoe/attachment/wiki/Doc/chk.jpg
(But it needs a few touch-ups, as per this message:
http://allmydata.org/pipermail/tahoe-dev/2007-September/000151.html )
There are some changes to this scheme that we might like to make in
the future.
First of all, we would like to squeeze the CHK URI itself to make it
easier to store/transport lots of them either with automation or
manually by a user. Currently the CHK URI has a 128-bit encryption
key and a 256-bit hash of the URI-extension.
We'd like to ask: how small can we make these crypto values? As an
extreme example of convenience at a potential cost in security, we
could have encryption keys be 80 bits and the URI-extension hash be
65 bits. I don't think I would be comfortable with smaller crypto
values than that, and I'm not sure I would be comfortable with them
that small. (You may wonder why 65 bits instead of 64. Well, it's
one better, isn't it? Also, when the value is base32 encoded then
you pay the cost of the final character whether that final (13th)
char holds 5 bits or only 4.)
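The base32 arithmetic behind that remark can be checked with a minimal sketch. It assumes only that each base32 character carries 5 bits and that no padding is used; the helper names are made up for illustration and are not part of Tahoe.

```python
import math

BITS_PER_BASE32_CHAR = 5  # every base32 character encodes 5 bits


def base32_chars(bits: int) -> int:
    """Unpadded base32 characters needed to hold `bits` bits."""
    return math.ceil(bits / BITS_PER_BASE32_CHAR)


def spare_bits(bits: int) -> int:
    """Capacity left unused in the final character."""
    return base32_chars(bits) * BITS_PER_BASE32_CHAR - bits


# Sizes mentioned in this message: proposed hash, proposed key,
# and the current 128-bit key and 256-bit hash.
for bits in (64, 65, 80, 128, 256):
    print(f"{bits:3d} bits -> {base32_chars(bits):2d} chars, "
          f"{spare_bits(bits)} spare bit(s)")
```

A 64-bit and a 65-bit value both need 13 characters, so the 65th bit costs nothing in URI length.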
For reference, here is a CHK from the current version of Tahoe:
URI:CHK:waca7jp68eduw4w7tz799b8s8c:u3dw4cp9dbu58bysca8zi48bu578kr9hob9jczgktwynbjddawuo:3:10:2164632
And here it is encoded into a URL that could in theory be clicked on
by another Tahoe user:
http://localhost:8081/uri/URI%3ACHK%3Awaca7jp68eduw4w7tz799b8s8c%3Au3dw4cp9dbu58bysca8zi48bu578kr9hob9jczgktwynbjddawuo%3A3%3A10%3A2164632?filename=Ravel_Gaspard-de-la-Nuit_Scarbo_Arno-Waschk.mp3
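The URL form is the URI with its colons percent-encoded as %3A. A minimal sketch of that step, using Python's standard urllib.parse.quote; the chk_url helper and its base parameter are hypothetical names for illustration, not Tahoe's own code:

```python
from urllib.parse import quote


def chk_url(uri: str, filename: str,
            base: str = "http://localhost:8081/uri/") -> str:
    """Build a clickable URL from a CHK URI (hypothetical helper).

    safe="" makes quote() escape every reserved character, so each
    ":" separator in the URI becomes "%3A".
    """
    return f"{base}{quote(uri, safe='')}?filename={quote(filename, safe='')}"


uri = ("URI:CHK:waca7jp68eduw4w7tz799b8s8c:"
       "u3dw4cp9dbu58bysca8zi48bu578kr9hob9jczgktwynbjddawuo:3:10:2164632")
url = chk_url(uri, "Ravel_Gaspard-de-la-Nuit_Scarbo_Arno-Waschk.mp3")
print(url)
print(len(uri), "characters as a URI,", len(url), "as a URL")
```

Each separator grows from one character to three once encoded, which is part of why the URL form is so much longer than the URI.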
And here is what one would look like if it had 145 bits of crypto
material in it (an 80-bit key plus a 65-bit hash) instead of 384 bits:
URI:CHK:waca7jp68eduw4:u3dw4cp9dbu58bys:3:10:2164632
http://localhost:8081/uri/URI%3ACHK%3Awaca7jp68eduw4%3Au3dw4cp9dbu58bys%3A3%3A10%3A2164632?filename=Ravel_Gaspard-de-la-Nuit_Scarbo_Arno-Waschk.mp3
Another way to squeeze CHK URIs would be to move the encoding
parameters (K and N) and the filesize from the URI to the URI
extension block. I know that the encoding parameters formerly had to
be in the URI because we used them (along with the storage index) for
locating shares in the "tahoe3" peer selection algorithm [1]. I
vaguely recall that Brian wanted filesize in the URI in order to make
it easier for some kind of user to know the filesize without having
to fetch a URI extension block, but now I'm not sure if that is
correct. Brian: why do we have filesize in the URI?
Anyway, if we could remove those two from the URI, then it would look
like this:
URI:CHK:waca7jp68eduw4:u3dw4cp9dbu58bys
http://localhost:8081/uri/URI%3ACHK%3Awaca7jp68eduw4%3Au3dw4cp9dbu58bys?filename=Ravel_Gaspard-de-la-Nuit_Scarbo_Arno-Waschk.mp3
Yet another way to squeeze URIs is to remove the scheme and
separators, leaving just the uncompressible bits:
waca7jp68eduw4u3dw4cp9dbu58bys
http://localhost:8081/uri/waca7jp68eduw4u3dw4cp9dbu58bys?filename=Ravel_Gaspard-de-la-Nuit_Scarbo_Arno-Waschk.mp3
(See also ticket #102, which is on the topic of compressed and/or
URL-safe directory URIs rather than CHK URIs.)
Okay, that's about as compressed as I can make it!
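The three layouts discussed in this message can be compared side by side with a rough sketch. It assumes the colon-separated field order seen in the examples (key, URI-extension hash, K, N, filesize); the ChkUri, parse_chk and layouts names are invented for illustration and do not come from the Tahoe source.

```python
from typing import Dict, NamedTuple


class ChkUri(NamedTuple):
    key: str       # base32 encryption key
    ueb_hash: str  # base32 hash of the URI extension
    k: int         # encoding parameter K
    n: int         # encoding parameter N
    size: int      # filesize in bytes


def parse_chk(uri: str) -> ChkUri:
    """Split a URI of the form URI:CHK:key:hash:K:N:size."""
    scheme, kind, key, ueb_hash, k, n, size = uri.split(":")
    if (scheme, kind) != ("URI", "CHK"):
        raise ValueError("not a CHK URI")
    return ChkUri(key, ueb_hash, int(k), int(n), int(size))


def layouts(u: ChkUri) -> Dict[str, str]:
    """Render the same crypto values in each layout."""
    return {
        "full": f"URI:CHK:{u.key}:{u.ueb_hash}:{u.k}:{u.n}:{u.size}",
        "no_params": f"URI:CHK:{u.key}:{u.ueb_hash}",
        "bare": u.key + u.ueb_hash,
    }


short = parse_chk("URI:CHK:waca7jp68eduw4:u3dw4cp9dbu58bys:3:10:2164632")
for name, text in layouts(short).items():
    print(f"{name:10s} {len(text):3d} chars  {text}")
```

One consequence of the bare layout is that nothing marks where the key ends and the hash begins, so a reader of that form would have to know the field widths in advance.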
Now, suppose we didn't want to go for 65-bit hashes and 80-bit
keys. Those are definitely smaller than the size recommended by the
current conventional wisdom of cryptographers. We could have, let's
say, 67-bit hashes and 128-bit keys:
waca7jp68eduw4w6e7gudxaac63aefugb77qtac
http://localhost:8081/uri/waca7jp68eduw4w6e7gudxaac63aefugb77qtac?
filename=Ravel_Gaspard-de-la-Nuit_Scarbo_Arno-Waschk.mp3
Regards,
Zooko
[1] http://allmydata.org/trac/tahoe/wiki/PeerSelection
tickets mentioned in this message:
http://allmydata.org/trac/tahoe/ticket/102