#204 new enhancement

"virtual CDs"

Reported by: warner Owned by:
Priority: major Milestone: 2.0.0
Component: code-dirnodes Version: 0.6.1
Keywords: dirnodes newcaps performance random-access space-efficiency tahoe-backup Cc:
Launchpad Bug:


We've kicked around the idea of low-overhead read-only immutable directory trees. The original "filetree" design included "realms" of dirnodes, in which each realm was a serialized tree of dirnodes and filenodes. In that (abandoned) design, there were both mutable and immutable realms. The biggest problem (apart from the complexity) was in figuring out how to share individual subtrees without giving access to parent dirnodes in the same realm.

When we went to actually implement dirnodes, we opted for a much simpler approach, in which each dirnode is stored in exactly one slot. This removed most of the complexity and allowed us to produce a useable vdrive with the fine-grained access control semantics that we wanted, but increases the overhead of storing and accessing dirnodes, especially if those dirnodes are being accessed through read-only capabilities.

One option I'd like to explore is to serialize a whole tree of directories into a single storage unit. This would reduce the access and storage overhead (fewer peers to contact to traverse the tree) at the expense of making sharing more complicated (to share a subtree, you have to copy it out into a new structure).

Zooko has been pointing two things out to me for months now that only recently sunk in.

  • most directories are probably only going to be accessed by a single user, so it might be ok to put off some work until someone actually tells us that they want to carve off a subtree (i.e. create a new get_shared_uri() method, and allow it to return a Deferred and do more work). In addition many shared directories are going to be shared in a read-only fashion.
  • there is benefit to putting related pieces of information in the same place, specifically that a collection of dirnodes that mainly exist for the benefit of a single user could be placed on the same servers or in the same storage slots.

In addition, we've been thinking for a while that a read-only immutable tree of dirnodes would be a useful data type. We've been calling this "burning a Virtual CD", since that phrase expresses the immutability properties pretty accurately.

So what I'm thinking now is that these "virtual CDs" should be serialized into a single data structure, but that the fine-grained access control should be implemented by putting a different encryption key on each internal dirnode, and the child keys should be hash-derived from the parent keys. The read-capability URI that references this structure should include the CHK identification information for the structure as a whole, plus the offset+length and encryption key for the individual dirnode being referenced.

This would make fine-grained sharing easy. If we improve the CHK download code to allow random access, it would limit the amount of data that needs to be transferred to a single segment plus hash overhead (and we'd probably want to use a smaller segsize for these structures). By keeping the data structure immutable, we don't need to worry about those URIs becoming stale or invalidated by data changes after they've been minted. The URIs might be a bit long (CHK length plus extra stuff), but it might be possible to use the hash of the dirnode (which includes the hashes of all its children) instead of the usual UEB hash (although that would probably make it hard to isolate corrupted shares.. must think about this further).

Change History (9)

comment:1 Changed at 2008-03-08T00:43:59Z by warner

  • Summary changed from immutable dirnodes to immutable dirnodes, "virtual CDs"

comment:2 Changed at 2008-04-24T23:50:45Z by warner

  • Component changed from code-encoding to code-dirnodes

comment:3 Changed at 2008-06-01T20:39:38Z by warner

  • Milestone changed from eventually to undecided

comment:4 Changed at 2009-02-25T22:54:56Z by zooko

  • Summary changed from immutable dirnodes, "virtual CDs" to "virtual CDs"

comment:5 Changed at 2009-10-28T07:20:54Z by davidsarah

  • Keywords newcaps added

Tagging issues relevant to new cap protocol design.

comment:6 Changed at 2009-12-20T17:12:00Z by davidsarah

  • Keywords performance added

The minimum readcap size for an internal node of a virtual CD (just counting crypto fields), would be the size of a collision-resistant hash plus a server selection index. This minimum can be achieved if a server indexes all the internal nodes in the CDs it is storing. For example, each internal node can have its own SI, which points to the CD's bucket and the offset/length of the node in the CD.

comment:7 Changed at 2009-12-20T17:40:27Z by davidsarah

  • Keywords random-access added

comment:8 Changed at 2010-04-25T20:59:31Z by davidsarah

  • Keywords space-efficiency added

comment:9 Changed at 2010-04-25T21:02:42Z by davidsarah

  • Keywords tahoe-backup added; vdrive removed
  • Milestone changed from undecided to 2.0.0

Apparently many source control systems (svn, darcs, git) use lots of small files, so backing up a source repository is very inefficient at the moment. This ticket could probably help with that, if tahoe backup used virtual CDs.

Note: See TracTickets for help on using tickets.