#869 new enhancement

Allow Tahoe filesystem to be run over a different key-value-store / DHT implementation — at Version 4

Reported by: davidsarah Owned by: nobody
Priority: major Milestone: undecided
Component: code-network Version: 1.5.0
Keywords: scalability performance forward-compatibility backward-compatibility availability newcaps docs anti-censorship
Cc:
Launchpad Bug:

Description (last modified by davidsarah)

source:docs/architecture.txt describes Tahoe as comprising three layers: key-value store, filesystem, and application.

Most of what makes Tahoe different from other systems is in the filesystem layer -- the layer that implements a cryptographic capability filesystem. The key-value store layer implements (a little bit more than) a Distributed Hash Table, which is a fairly well-understood primitive with many implementations. The Tahoe filesystem and applications could in principle run on a different DHT, and it would still behave like Tahoe -- with different (perhaps better, depending on the DHT) scalability, performance, and availability properties, but with confidentiality and integrity ensured by Tahoe without relying on the DHT servers.

However, there are some obstacles to running the Tahoe filesystem layer on another DHT:

  • the code isn't strictly factored into layers (even though most code files belong mainly to one layer), so there isn't a narrow API between the key-value store and the filesystem-related abstractions (a possible shape for such an API is sketched just after this list).
  • the communication with servers currently needs to be encrypted (independently of the share encryption), and other DHTs probably wouldn't support that.
  • because the filesystem has only been used with one key-value store layer up to now, it may make assumptions about that layer that haven't been clearly documented.
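
To make the first point concrete: a narrow key-value-store API of the kind that is currently missing might look roughly like the sketch below, assuming the filesystem layer needs only immutable put/get plus capability-controlled mutable slots. The names are hypothetical, not existing Tahoe interfaces.

    # Hypothetical narrow interface between the filesystem layer and a
    # pluggable key-value store / DHT backend.  Names are illustrative
    # only and do not correspond to current Tahoe-LAFS internals.

    from abc import ABC, abstractmethod
    from typing import Tuple

    class KeyValueStore(ABC):
        # --- immutable files ---
        @abstractmethod
        def put_immutable(self, ciphertext: bytes) -> bytes:
            """Store ciphertext; return the storage key derived from it."""

        @abstractmethod
        def get_immutable(self, key: bytes) -> bytes:
            """Fetch previously stored ciphertext by key."""

        # --- mutable slots (the part most off-the-shelf DHTs lack) ---
        @abstractmethod
        def create_slot(self) -> Tuple[bytes, bytes]:
            """Create a mutable slot; return (writecap, readcap)."""

        @abstractmethod
        def write_slot(self, writecap: bytes, ciphertext: bytes) -> None:
            """Replace the slot contents; requires the write capability."""

        @abstractmethod
        def read_slot(self, readcap: bytes) -> bytes:
            """Read the current slot contents using the read capability."""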

Note that even if the Tahoe code were strictly layered, we should still expect porting Tahoe to a particular DHT to take significant effort. The DHT servers would probably have to run some Tahoe code in order to verify shares, for example.

Change History (4)

comment:1 Changed at 2009-12-22T05:28:49Z by warner

Hmm, good points. This ties in closely with the docs outline that we wrote up (but haven't yet followed up on by writing the actual documentation it calls for): docs/specifications/outline.rst.

As you note, there are several abstraction-layer leaks which would need to be plugged or accommodated in order to switch to a general-purpose DHT for the bottom-most layer. Here are a few thoughts.

  • the main special feature that we require of the bottom-most DHT layer is support for mutable files. All of the *immutable*-file stuff is fairly standard DHT material. But to implement Tahoe's mutable files, we need a distributed slot primitive with capability-based access control: creating a slot should return separate read- and write-caps, and there should be some means of repairing shares without being able to forge new contents.
  • the only need for encrypted server connections is to support the shared secret used to manage mutable-slot access control (which we'd like to get rid of anyway, because it makes share migration harder, and it makes repair-from-readcap harder). If we had a different mechanism, e.g. representing slot-modify authority with a separate ECDSA private key per (server, slot) pair, then we could probably drop this requirement. (There is some work to do w.r.t. replay attacks and building a suitable protocol with which to prove knowledge of the private key, but these are well-understood problems.)
  • on the other hand, the shared-secret slot-modify authority is nice and simple, is fast and easy for the server to verify (meaning a slow server can still handle lots of traffic), and doesn't require the server to have detailed knowledge of the share layout (which decouples server version from client version). Most of the schemes we've considered for signed-message slot-modify operations require the servers to verify the proposed new slot contents thoroughly, making it harder to deploy new share types without simultaneously upgrading all the servers.
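
To make the trade-off in the last two points concrete, here is a rough sketch of the two server-side authorization checks. The names are hypothetical, and the ECDSA verification and share-format validation are left as placeholders:

    # Sketch of two ways a storage server could authorize a mutable-slot
    # write.  Illustrative only; not actual Tahoe-LAFS code.

    import hashlib
    import hmac

    def authorize_with_shared_secret(stored_secret: bytes,
                                     presented_secret: bytes) -> bool:
        # Shared-secret scheme: cheap to check, and the server needs no
        # knowledge of the share layout -- but the connection must be
        # encrypted so the secret is not disclosed in transit.
        return hmac.compare_digest(stored_secret, presented_secret)

    def authorize_with_signature(slot_pubkey, new_share: bytes,
                                 signature: bytes) -> bool:
        # Signed-message scheme: one key pair per (server, slot).  No
        # shared secret to protect, but the server must verify the
        # signature and (in most proposed schemes) validate the new
        # share contents, coupling server and share-format versions.
        digest = hashlib.sha256(new_share).digest()
        if not verify_ecdsa(slot_pubkey, digest, signature):
            return False
        return validate_share_format(new_share)

    def verify_ecdsa(pubkey, digest: bytes, signature: bytes) -> bool:
        raise NotImplementedError("placeholder: use a real ECDSA library")

    def validate_share_format(share: bytes) -> bool:
        raise NotImplementedError("placeholder: server-side share checks")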

There might also be some better ways of describing Tahoe's nominal layers, in a sense refactoring the description or shuffling around the dotted lines. I've been trying to write up a presentation using the following arrangement:

  • We could say that the lowermost layer is responsible for providing availability, reliability, and integrity: this layer has all the distributed stuff, erasure coding, and hashes to guard against corrupted shares, but you could replace it with a simple local lookup table if you didn't care about that sort of thing. This layer provides a pair of immutable operations (key=put(data) and data=get(key)), and a triple of mutable operations (writecap,readcap=create(), put(writecap,data), data=get(readcap)). The check/verify/repair operations work entirely at this level. All of the 'data' at this layer is ciphertext.
  • The next layer up gets you plaintext: the immutable operations are key=f(readcap), ciphertext=encrypt(key, plaintext), and plaintext=decrypt(key, ciphertext). The mutable operations are the same, plus something to give you the writecap-accessible-only column of a dirnode. If you didn't care about confidentiality, you could make these NOPs.
  • The layer above that gets you directories, and is mostly about serializing the childname->childcap+metadata table into a mutable slot (or immutable file). If you have some other mechanism to manage your filecaps, you could ignore this layer.
  • The layer above that provides some sort of API to non-Tahoe code, making all of the other layers accessible from outside. This presents operations like data=get(readcap), children=read(dircap), etc.
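
A minimal sketch of that arrangement as Python classes, with the crypto, erasure coding, and serialization all stubbed out (names are mine, not Tahoe's):

    # Illustrative sketch of the four layers above; everything
    # interesting (crypto, erasure coding, serialization) is stubbed.

    import hashlib

    class CiphertextLayer:
        """Layer 1: availability/reliability/integrity.  In real life a
        DHT with erasure coding and hashes; here, a local lookup table.
        Mutable-slot operations (create/put/get with caps) are elided."""
        def __init__(self):
            self._table = {}
        def put(self, data: bytes) -> bytes:      # key = put(data)
            key = hashlib.sha256(data).digest()
            self._table[key] = data
            return key
        def get(self, key: bytes) -> bytes:       # data = get(key)
            return self._table[key]

    class PlaintextLayer:
        """Layer 2: confidentiality.  key = f(readcap), encrypt/decrypt.
        Stubbed as identity transforms; the readcap doubles as the key."""
        def __init__(self, below: CiphertextLayer):
            self.below = below
        def upload(self, plaintext: bytes) -> bytes:
            ciphertext = plaintext                 # placeholder encrypt()
            return self.below.put(ciphertext)      # returned key ~ readcap
        def download(self, readcap: bytes) -> bytes:
            return self.below.get(readcap)         # placeholder decrypt()

    class DirectoryLayer:
        """Layer 3: serialize the childname -> (childcap, metadata)
        table into a mutable slot or immutable file."""

    class GatewayAPI:
        """Layer 4: what non-Tahoe code sees, e.g. data = get(readcap)
        and children = read(dircap)."""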

One way to look at Tahoe is in terms of that top-most API: you don't care how it is implemented, you just need to know about filecaps and dircaps. Another view shows some client code, the API, the gateway node, and the servers that the gateway connects to: this diagram would show different sorts of messages traversing the different connections. A third view would abstract the servers and the DHT/erasure-coding stuff into a lookup table, and focus on the crypto-and-above layers.

Last edited at 2014-03-03T01:14:03Z by daira

comment:2 Changed at 2010-01-20T07:10:09Z by davidsarah

  • Summary changed from "Allow Tahoe filesystem to be run over a different grid/DHT implementation" to "Allow Tahoe filesystem to be run over a different key-value-store / DHT implementation"

The "grid layer" is now called the "key-value store layer".

comment:3 Changed at 2010-03-25T00:10:17Z by davidsarah

  • Description modified (diff)

comment:4 Changed at 2010-03-25T00:11:41Z by davidsarah

  • Description modified (diff)