#3 closed enhancement (fixed)

CHK-URIs: derive storage index from readkey to make the URI shorter

Reported by: warner Owned by: warner
Priority: minor Milestone: 0.5.0
Component: code Version: 0.4.0
Keywords: Cc:
Launchpad Bug:

Description (last modified by warner)

The URI currently contains separate readkey and StorageIndex fields. We should redefine the read-cap CHK file URI to include just the readkey and derive the StorageIndex from it by hashing.

Change History (6)

comment:1 Changed at 2007-04-28T19:17:51Z by warner

  • Component changed from component1 to code

comment:2 Changed at 2007-06-29T18:49:20Z by warner

  • Description modified (diff)
  • Summary changed from URIs are too big to URIs could be a bit smaller
  • Version set to 0.3.0

We've addressed the most immediate problem here, by moving many of the pieces off to the URIExtension, and including the hash of that datablock in the URI itself. This makes the URIs smaller, at the expense of increasing the storage overhead slightly (about 200 bytes per share), and increasing the alacrity slightly (you have to pull 200 bytes from one shareholder before you can verify the first segment).

Once we also switch to deriving the StorageIndex from the readkey, this will shrink the URI down to the following pieces:

  • readkey (base32-encoded 16 or 32 byte value)
  • URIExtension hash (base32-encoded 32 byte value)
  • needed_shares/total_shares (two small integers, normally "25" and "100")
  • filesize (3-7 bytes, really just for quicker UI purposes)

(At the moment, we track the readkey and the storage index separately, so our URIs are another 53 characters longer than this.)
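As a rough illustration of the field sizes listed above, here is a minimal Python sketch that packs those pieces into one string. The names `b32` and `pack_uri`, the colon separator, the field order, and the lowercase unpadded base32 are all assumptions for illustration; this ticket does not spell out the exact layout.

```python
import base64
import os


def b32(data: bytes) -> str:
    """Base32-encode without '=' padding (hypothetical helper)."""
    return base64.b32encode(data).decode("ascii").rstrip("=").lower()


def pack_uri(readkey: bytes, ueb_hash: bytes, needed: int, total: int, size: int) -> str:
    """Join the URI pieces with ':' (assumed separator and field order)."""
    return "URI:%s:%s:%d:%d:%d" % (b32(readkey), b32(ueb_hash), needed, total, size)


if __name__ == "__main__":
    uri = pack_uri(os.urandom(16), os.urandom(32), 25, 100, 28000)
    print(len(b32(os.urandom(16))), len(b32(os.urandom(32))), len(uri))
```

Under these assumptions a 32-byte value costs 52 base32 characters, so a separate 32-byte storage index plus one separator would be consistent with the 53 extra characters mentioned above.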

I'm redefining this ticket to be about reducing the size of the URI, by deriving the StorageIndex from the readkey. The issue of algorithmically generating things like segment size and encoding parameters from the filesize is less important, in my opinion, now that it's been pushed out to the URIExtension.

comment:3 Changed at 2007-07-02T19:46:31Z by warner

  • Milestone set to release 1.0
  • Summary changed from URIs could be a bit smaller to CHK-URIs: derive storage index from readkey to make the URI shorter
  • Version changed from 0.3.0 to 0.4.0

comment:4 Changed at 2007-07-02T19:46:38Z by warner

  • Type changed from defect to enhancement

comment:5 Changed at 2007-07-22T01:24:44Z by warner

  • Milestone changed from release 1.0 to 0.5.0
  • Owner changed from somebody to warner
  • Status changed from new to assigned

I decided to go ahead and do this now, in 81a99044554f72ef, since I changed the URI header anyway (from a bare "URI:" to "URI:CHK:") in the process of refactoring URI processing.

This brings the URI for a 28kB file down from 165 characters to 108.

We still need to talk about some crypto stuff: we certainly want the storage index to be unique, it might be nice to have it be unguessable, and we should think about how the Birthday Attack impacts this. Given that there are half as many bits in the readkey as there were in the storage index, we're working with less entropy than we used to, so it might be sensible to put a 32-byte value into the URI, truncate it for use as the readkey, and hash the whole thing to generate the storage index.
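To put rough numbers on the birthday concern, here is a small Python sketch using the standard approximation p ≈ 1 - exp(-n(n-1) / 2^(b+1)) for the chance that at least two of n uniformly random b-bit values collide. The function name and the sample counts are illustrative assumptions, and the approximation treats storage indexes as independent random values.

```python
import math


def collision_probability(n_items: int, bits: int) -> float:
    """Approximate birthday-bound collision probability for n random b-bit values."""
    exponent = -(n_items * (n_items - 1)) / 2.0 ** (bits + 1)
    # -expm1(x) equals 1 - exp(x) but stays accurate when x is tiny.
    return -math.expm1(exponent)


if __name__ == "__main__":
    for bits in (128, 256):
        for n in (2 ** 40, 2 ** 64):
            print(bits, n, collision_probability(n, bits))
```

With 128-bit values the collision chance only becomes substantial at around 2^64 items, which is the usual "half the bits" rule of thumb behind the concern above.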

comment:6 Changed at 2007-07-24T18:20:19Z by warner

  • Resolution set to fixed
  • Status changed from assigned to closed

We decided to truncate the storage index to the same 128 bits that are present in the AES key that it's derived from, to make it clear that we understand our basic information theory.
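A minimal sketch of the idea, assuming plain SHA-256 and a made-up domain-separation tag: hash the 16-byte readkey and keep the first 128 bits as the storage index. The ticket does not name the hash construction or tag that the code uses, so `derive_storage_index` and `TAG` below are hypothetical.

```python
import hashlib
import os

# Hypothetical domain-separation tag; the real tag is not given in this ticket.
TAG = b"example_storage_index_tag:"


def derive_storage_index(readkey: bytes) -> bytes:
    """Hash the readkey and truncate the digest to 128 bits (16 bytes)."""
    if len(readkey) != 16:
        raise ValueError("this sketch expects a 16-byte (128-bit) AES readkey")
    return hashlib.sha256(TAG + readkey).digest()[:16]


if __name__ == "__main__":
    key = os.urandom(16)
    print(derive_storage_index(key).hex())
```

Because the derivation is deterministic, anyone holding the readkey can recompute the storage index, which is what allows the URI to drop the separate field.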

Finally closing this one.
