[tahoe-dev] Thinking about building a P2P backup system
Shawn Willden
shawn-tahoe at willden.org
Fri Jan 9 22:00:46 PST 2009
On Friday 09 January 2009 01:33:39 pm Brian Warner wrote:
> The 'revlog' format that I'd like to use for our eventual "LDMF" (Large
> Distributed Mutable Files) design uses forward deltas and an append-only
> data structure, but writes out a full copy of the file whenever the amount
> of data you'd need to read to construct the latest version gets bigger than
> twice the size of the file. I think the Mercurial folks picked this 2x
> thing as a reasonable tradeoff between speed and storage.
Is there infrastructure in place for append-only files? That would work much
better for incremental backups than storing the forward deltas as separate
files. Actually, the LDMF design you're talking about would make handling
this in the backup server unnecessary. What's the status of that? Does it
makes sense to get that working first?
One suggestion: If you compute an rsync-style rolling checksum on the client
when you upload the original file, and then again with each delta, you won't
need to download the whole file to compute the next forward delta. All
you'll have to download is the rolling checksum for the most recently
uploaded version (full or delta), which is about 1% of the size of the full
file.
> For our mutable files, this also gives us versioning: we create a read-cap
> that contains both a pointer to the file overall (i.e. the revlog) and a
> secure identifier for the particular revision that we care about. The
> question of how far back to keep the deltas becomes a matter of retention
> policy and GC, with the obvious limitation that the oldest thing in the
> revlog should be a full copy.
It may also be worth thinking about some support for reverse deltas. If you
add a little more work whenever you upload a new full copy, then you can
reduce storage usage by avoiding the retention of multiple full copies.
Download the last full version plus all of the forward deltas and then as you
go along patching the full version to bring it up to date, generate a trail
of reverse deltas. Then upload the new full copy as the head of the revlog
and the series of reverse deltas as the older versions. New versions become
forward deltas.
GC of versions older than the last full copy is a simple matter of removing
them from the revlog. Of course, that breaks the "append only" semantics you
mentioned.
> The format I've been sketching out would allow for efficient insertions and
> deletions of arbitrary byte spans, which would be pretty wild. The standard
> POSIX file io API doesn't give any way to express this, but an application
> that's speaking directly to Tahoe could take advantage of it. Imagine being
> able to insert a byte in the beginning of your file in constant time.
Cool.
> The limitation is actually a combination of encryption and erasure-coding
> (which, unless you're using the degenerate k=1 replication case, involves
> splitting the data up among multiple servers). Both make it hard or
> impossible to apply rsync-like algorithms on the server side.
Makes sense. I don't know enough about erasure codes to know if
they "randomize" the data in the pieces to make patching ineffective. It
doesn't surprise me that this is the case, though.
> > If librsync deltas are reversible (need to check)
>
> Incidentally, for deltas to be reversible, they have to contain roughly
> twice the information as an irreversible delta.
That also makes sense. Incidentally, that doubling may be worthwhile for
LMDF. If you decide on an approach that does require downloading and
reassembling version n-1 in order to calculate the delta from n-1 to n, you
might also want to calculate the reverse delta and upload that as well. That
would enable you to drop an old full copy at will, since you'd have the
reverse deltas in place to regenerate it at need.
Shawn.
More information about the tahoe-dev
mailing list