[tahoe-dev] Privacy of data when stored on allmydata.com
Brian Warner
warner-tahoe at allmydata.com
Tue Feb 3 17:44:35 PST 2009
On Tue, 3 Feb 2009 12:13:39 +1100
Andrej Falout <andrej at falout.org> wrote:
> I was told that "that for the production site, we need your root cap in
> order to do accounting which implicitly means that we have access to all of
> your files. We plan on changing this going forward, but at the moment you
> will have to rely upon an external encryption mechanism if you want to
> secure data from us."
So, here's the deal:
* Tahoe is very carefully designed to protect the integrity and
confidentiality of the data stored therein. If you don't have the
filecap/dircap, you can't read the file.
* However, there are some maintenance tasks that cannot yet be performed
without a readcap. In particular, to measure how much space is being used
by a given directory (and its children), we need to traverse all the files
and directories and add up their sizes. In addition, to perform file
checking and repair, we need to get a repaircap for each object, and
currently directories can only be repaired with writecaps.
* allmydata.com needs to be able to measure space used by each customer, to
support our business needs (i.e. to sell a 5GB account and hold the customer
to it).
* allmydata.com needs to be able to check+repair customer files, to keep
files reliable over long periods of time without user involvement.
* So, until we finish the Tahoe work that allows these maintenance tasks to
be done with less powerful access, allmydata.com needs access to those
rootcaps.
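
To make the space-measurement point concrete, here is a minimal sketch of the
kind of traversal described above. It is not Tahoe code: the Node class, its
fields, and total_size() are hypothetical names made up for illustration, and
a real traversal would fetch and decrypt each directory using its cap. The
point is only that summing sizes requires walking every child, which in turn
requires being able to read every directory.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a file or directory reachable from a rootcap."""
    cap: str                       # illustrative identifier, not a real Tahoe cap
    size: int                      # bytes consumed by this object alone
    children: List["Node"] = field(default_factory=list)  # empty for files


def total_size(root: Node) -> int:
    """Sum the sizes of every object reachable from root, counting each once."""
    seen = set()                   # caps already counted (shared or cyclic links)
    stack = [root]
    total = 0
    while stack:
        node = stack.pop()
        if node.cap in seen:
            continue               # same object linked from two places
        seen.add(node.cap)
        total += node.size
        stack.extend(node.children)
    return total


# Small example tree: one file is linked from two directories.
shared = Node("file-a", 100)
subdir = Node("dir-sub", 5, [Node("file-b", 200), shared])
root = Node("dir-root", 10, [shared, subdir])
usage = total_size(root)
```

The "seen" set is there because the same object can be linked from more than
one directory, and it should only be counted once.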
Once the Accounting project is done, we'll be able to measure backend space
used by each customer without needing to traverse the whole directory
structure, so private customer directories won't cause us provisioning
problems.
And once we implement DSA-based mutable files (with traversal caps), we'll be
able to get a set of verifycaps/repaircaps without being able to see the
plaintext of either files or directories. So once those tasks are done,
allmydata.com will merely hold a traversalcap for each customer, rather than
a writecap. Customers who maintain private directories can either grant us
the traversal cap (so we can do check+repair on their files for them), or
they will be obligated to perform periodic file checking/repair on those
private files by themselves.
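
As a rough illustration of "less powerful access", the sketch below shows a
toy chain in which each weaker cap is a one-way hash of the stronger one, so
holding a weaker cap does not let you recover a stronger one. This is an
assumption-laden toy model, not the actual Tahoe derivation or the DSA-based
design: the function names, the tags, and the position of the traversal cap
in the chain are all made up for the example.

```python
import hashlib


def attenuate(cap: bytes, tag: bytes) -> bytes:
    """Derive a weaker cap from a stronger one with a tagged one-way hash.

    Toy model only: real caps carry more structure than a bare digest.
    """
    return hashlib.sha256(tag + b":" + cap).digest()


def derive_caps(writecap: bytes) -> dict:
    """Build the hypothetical chain writecap -> readcap -> traversalcap -> verifycap."""
    readcap = attenuate(writecap, b"read")
    traversalcap = attenuate(readcap, b"traverse")
    verifycap = attenuate(traversalcap, b"verify")
    return {
        "writecap": writecap,
        "readcap": readcap,
        "traversalcap": traversalcap,
        "verifycap": verifycap,
    }


caps = derive_caps(b"example-writecap-secret")

# A party handed only the traversalcap can compute the verifycap,
# but cannot run the hash backwards to obtain the readcap or writecap.
from_traversal = attenuate(caps["traversalcap"], b"verify")
```

In this model, a customer who hands over only the traversal cap is giving
away the ability to derive verifycaps and nothing stronger.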
Finally, as a consumer-oriented backup service, most of our customers expect
us to be able to recover their files for them even if they've forgotten their
allmydata.com password. The usual authority model is that if you have the
same credit card that is paying for the account, and if you can receive email
at the main account address, then you get to see the files. For this reason,
most of our customers expect us to hold on to their rootcaps, even though
that means we have the ability to see their files.
Once Accounting and traversal caps are done, we can give our customers a
choice between password-recovery and privacy. The installer will have a
checkbox that explains the options and lets them choose whether to give us a
rootcap or not.
But, in the meantime, if you want to buy storage space from allmydata.com but
also want to make sure that allmydata.com can't see the contents of your
files, then you need to supply a second layer of encryption. The duplicity
plugin described by Francois is an excellent way to do this, although of
course you won't be able to use the www.allmydata.com webdrive to view the
unencrypted contents. Some backup schemes will aggregate files together in a
way that makes differential backups more difficult or less efficient. I
don't know how duplicity works, but I'm sure there are some schemes that
would provide the properties you're looking for.
I hope that explains some of our design decisions and what your current
options are.
cheers,
-Brian