[tahoe-dev] overview of allmydata.org "Tahoe" security properties (confidentiality, integrity, availability)
zooko
zooko at zooko.com
Mon Jan 21 13:50:37 PST 2008
Dear tahoe-dev, Norm Hardy, and anonymous security experts:
I've just updated the Tahoe docs to have a "top-level overview" which
attempts to answer Norm Hardy's "initial questions" of November 2007
[1].
This overview document is also, I hope, the best starting point for
those security experts out there who are interested in attacking
Tahoe's security design in return for fame and gratitude and handsome
items of swag that say "I'm [INSERT NAME HERE] and I broke the Tahoe
secure filesystem design! Thank you from [HEART] allmydata.org".
I've Bcc:'ed two such security experts on this message.
The overview document explains what security properties we think we
have achieved, and leaves it up to architecture.txt [3] to explain
how we think that we have achieved them.
The overview is now on-line [2], and it is probably best viewed in
its original HTML format, but appended is a text version of it for
the record.
Regards,
Zooko
[1] http://allmydata.org/pipermail/tahoe-dev/2007-November/000222.html
[2] http://allmydata.org/source/tahoe/trunk/docs/about.html
[3] http://allmydata.org/trac/tahoe/browser/docs/architecture.txt
------- begin appended file about.html, mashed into text
Overview
A "storage grid" comprises a number of storage servers. A storage
server has local attached storage (typically one or more SATA hard
disks). A "gateway" uses the storage servers and provides to its
clients a filesystem over a standard protocol such as HTTP(S), FUSE,
or SMB.
Users do not rely on storage servers to provide confidentiality or
integrity for the data. Instead, all of the data is encrypted and
integrity-checked by the gateway, so that the servers can neither
read the data nor alter it undetected.
Users do rely on the storage servers for availability -- the
ciphertext is erasure-coded and distributed across N different
storage servers (the default value for N is 12) so that it can be
recovered from any K of these servers (the default value of K is 3).
Therefore only the simultaneous failure of N-K+1 (with the defaults,
10) servers can make the data unavailable. Phrasing this in terms of
reliance, we say that the users rely on the gateway for the
confidentiality and integrity of the data, and on any 3 of the 12
servers for the availability of the data.
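To make the arithmetic above concrete, here is a minimal Python sketch
(not Tahoe code). The function names are hypothetical, and the
availability estimate assumes that servers fail independently and with
equal probability, which is an assumption of this sketch and not a
claim made by the overview.

```python
from math import comb


def failures_to_lose_data(n: int, k: int) -> int:
    """Smallest number of simultaneous server failures that leaves
    fewer than k servers, so the data can no longer be recovered."""
    return n - k + 1


def availability(n: int, k: int, p_up: float) -> float:
    """Probability that at least k of n servers are up.

    Assumes (for illustration only) that each server is up
    independently with probability p_up.
    """
    return sum(
        comb(n, i) * p_up**i * (1 - p_up) ** (n - i)
        for i in range(k, n + 1)
    )


if __name__ == "__main__":
    # Defaults from the overview: N = 12, K = 3.
    print(failures_to_lose_data(12, 3))
    print(availability(12, 3, 0.9))
```

With the defaults, failures_to_lose_data(12, 3) returns 10, matching
the figure in the text.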
The typical deployment mode is that each user runs her own gateway on
her own machine. This way she needs to rely only on her own machine
for the confidentiality and integrity of the data, and she can take
advantage of more tightly integrated filesystem interfaces such as
FUSE and SMB.
An alternate deployment mode is that the gateway runs on a remote
machine and the user connects to it over HTTPS. This means that the
operator of the gateway can view and modify the user's data (the user
relies on the gateway for confidentiality and integrity), but it
also means that the user can access the filesystem with a client that
doesn't have the gateway software installed, such as an Internet
kiosk or cell phone.
A user who has read-write access to a file or directory can give
another user read-write access to that file or directory, or can give
another user read-only access to that file or directory. A user who
has read-only access to a file or directory can give another user
read-only access to it.
When linking a file or directory into a parent directory, you can use
a read-write link or a read-only link. If you use a read-write link,
then anyone who has read-write access to the parent directory can
gain read-write access to the child, but anyone who has read-only
access to the parent directory can gain only read-only access to the
child. If you use a read-only link, then anyone who has either
read-write or read-only access to the parent directory can gain
read-only access to the child.
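The sharing and linking rules in the two paragraphs above can be
modelled in a few lines of Python. This is only a sketch of the rules
as stated here, not of how Tahoe implements them (architecture.txt
covers that). The names Access, can_grant, and child_access are
hypothetical and do not come from the Tahoe source.

```python
from enum import Enum


class Access(Enum):
    READ_ONLY = "read-only"
    READ_WRITE = "read-write"


def can_grant(holder: Access, granted: Access) -> bool:
    """A read-write holder can grant either kind of access;
    a read-only holder can grant only read-only access."""
    return holder is Access.READ_WRITE or granted is Access.READ_ONLY


def child_access(parent_access: Access, link: Access) -> Access:
    """Access to a child gained through a parent directory.

    Read-write access to the child requires both read-write access
    to the parent and a read-write link; every other combination
    yields read-only access.
    """
    if parent_access is Access.READ_WRITE and link is Access.READ_WRITE:
        return Access.READ_WRITE
    return Access.READ_ONLY


if __name__ == "__main__":
    for parent in Access:
        for link in Access:
            result = child_access(parent, link)
            print(f"parent={parent.value}, link={link.value} -> {result.value}")
```

In this model access can only stay the same or narrow along a chain of
links, never widen.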
There are two kinds of files: immutable and mutable. Immutable files
have the property that once they have been uploaded to the storage
grid they can't be modified. Mutable ones can be modified.
For much more technical detail, please see The Doc Page on the Wiki,
and the other files in the docs directory of the source tree.