[tahoe-dev] safety and Tahoe Lock Files
zooko
zooko at zooko.com
Mon Mar 3 18:48:30 PST 2008
Following up on my own post:
On Mar 3, 2008, at 5:20 PM, zooko wrote:
> When you give someone a write-cap to a mutable file-or-directory, M1,
> which you yourself are also intending to write into in the future,
> you also give them a write-cap to a mutable Tahoe lockfile, L1.
>
> Thereafter, whenever you want to write to M1, you first read L1 to
> see if it is currently locked. If L1 is empty (zero length), then M1
> is currently unlocked.
>
> To lock M1, you pick a random 32-byte string and write that string
> into L1.
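The quoted steps can be sketched roughly as follows. This is only an illustration in Python, not Tahoe code: InMemoryLockFile is a hypothetical stand-in for the mutable lock file L1, and try_lock is a made-up helper name. It covers only the steps quoted above (read L1, treat zero length as unlocked, write a random 32-byte string), and the check-then-write shown here is not atomic.

```python
import os


class InMemoryLockFile:
    """Hypothetical stand-in for the mutable Tahoe lock file L1 (not a Tahoe API)."""

    def __init__(self):
        self.contents = b""  # zero length means "unlocked"

    def read(self):
        return self.contents

    def write(self, data):
        self.contents = data


def try_lock(lockfile):
    """Return the lock string we wrote, or None if L1 was already locked."""
    if lockfile.read() != b"":
        return None  # someone else's lock string is already in L1
    lock_string = os.urandom(32)  # random 32-byte string
    lockfile.write(lock_string)
    return lock_string
```

A first call on an empty lock file returns a 32-byte string; a second call on the same lock file returns None because L1 is no longer empty.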
A good question to ask about this proposal is: why incur all the
overhead of using a separate lockfile L1, when you could just,
instead of attempting to write your random lock string into L1,
attempt to write your actual data into M1?
The answer is that there is a tiny but unavoidable chance that an
uncoordinated write to a Tahoe mutable file or directory will destroy
both the old and new contents of that file or directory, resulting in
permanent data loss. This chance is quite remote -- the only way it
could happen is if there were an unfortunate coincidence of servers
failing or getting disconnected from the network at the same time as
the writers failed or got disconnected from the network, and even
then it would happen only if a couple of timing patterns fell out the
wrong way. However, the more simultaneous uncoordinated writers
there are writing to a given mutable file or directory, and the more
frequently they write simultaneously, the more opportunities
there are for an unlucky pattern of sudden network outages and
crashes to cause permanent data loss.
We're strongly averse to even a small risk of permanent data loss, and
we would like to be able to say:
"""
Data Safety Guarantee: No matter what pattern of network outages
occurs, and no matter if your clients crash in the middle of
performing writes, and no matter if a limited subset of the servers
crash, have internal errors, or turn out to be subverted by
criminals, then there is *still* zero possibility of permanent data
loss, as long as there are at least K well-behaving servers left
which have shares of one version of your file.
"""
Note: to formulate this safety guarantee precisely, you have to think
about how certain patterns of network outages and server failures
could leave some servers holding the previous version of your file
while others hold the new version. Analyzing this safety guarantee
in terms of which set of
servers is well-behaved and is available to your writer at what times
is a difficult but feasible task. No such guarantee can be offered
in the presence of an unbounded number of simultaneous uncoordinated
writes -- when the Tahoe storage servers are under such a load then
it is always possible to permanently lose data due to an unlucky
pattern of network disconnections.
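To make the "at least K servers with shares of one version" condition concrete, here is a small Python sketch. It assumes a simplified model in which each well-behaving server holds exactly one share of exactly one version; the function name recoverable_versions and the server and version labels are hypothetical, chosen only for illustration.

```python
from collections import Counter


def recoverable_versions(version_by_server, k):
    """Return the set of versions held by at least k servers.

    version_by_server maps a (hypothetical) server name to the version
    of the file whose share that server currently holds.
    """
    counts = Counter(version_by_server.values())
    return {version for version, n in counts.items() if n >= k}


# With K = 3: three servers still agree on v1, so v1 is recoverable.
safe = recoverable_versions(
    {"s1": "v1", "s2": "v1", "s3": "v1", "s4": "v2"}, 3)

# With K = 3: interrupted uncoordinated writes have left the shares
# spread over three versions, and none of them reaches K servers.
lost = recoverable_versions(
    {"s1": "v1", "s2": "v1", "s3": "v2", "s4": "v2", "s5": "v3"}, 3)
```

In the second case the result is the empty set, which is the situation the guarantee has to rule out: neither the old contents nor any new contents can be reconstructed.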
Now, if we use the Tahoe Lock Files technique, then the lock file L1
is exposed to this lack of safety -- an unlucky pattern of failures
might cause both the current value of L1 and the new value that you
are attempting to write to be lost. However, this is no big deal!
The only thing that is lost is the lock string. The precious data
over in M1 remains safe from the (small) danger caused by
uncoordinated writes.
Regards,
Zooko