[tahoe-dev] Keeping local file system and Tahoe store in sync
Shawn Willden
shawn-tahoe at willden.org
Mon Feb 2 20:46:26 PST 2009
On Monday 02 February 2009 08:58:42 pm zooko wrote:
> Brian has been posting about this on the issue tracker, e.g.:
>
> http://allmydata.org/trac/tahoe/ticket/598
Thanks. It looks like his approach is sufficiently different from mine that
I'm going to just keep going as I am.
Key differences are:
1. With mine, mirroring the directory structure is optional. Not mirroring
it should make backups somewhat more efficient (and initial backups much more
efficient) because there's no need to create all those dircaps.
2. Forward-difference increments. This should make uploading small changes
to large files very efficient.
3. Backupdb may be stored locally OR in the grid. I haven't gotten far
enough to test it yet, but I think the performance hit for storing it in the
grid should be pretty small. It may be zero in some cases.
4. Backup of metadata in addition to file contents. Permissions, ACLs,
resource forks, etc. My ultimate goal is to be able to do whole-system
backups and restores, so this is essential.
5. Smart handling of hardlinks and symlinks.
6. A focus on the issue of initial, large uploads. A backup session can be
terminated and resumed, and reasonable timestamping of backups is maintained
to facilitate a future "Time Machine"-like view.
7. A general focus on efficiency. Not micro-optimization, but structuring
the backup process to avoid re-scanning, to facilitate streaming uploads, to
minimize creation of mutable nodes (i.e. dirnodes), etc.
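To make items 4 and 5 a bit more concrete, here is a minimal Python sketch of a single-pass scan that records per-file metadata, groups hardlinks, and stores symlink targets instead of following them. It is only an illustration under stated assumptions, not the actual backup code: the function name `scan_tree` and the layout of the returned dictionaries are hypothetical, it covers only the basic POSIX fields (no ACLs or resource forks), and it assumes a platform where `st_dev`/`st_ino` identify a file reliably.

```python
import os
import stat
from collections import defaultdict


def scan_tree(root):
    """Walk `root` once; return (entries, hardlink_groups).

    entries: relative path -> dict of basic metadata (hypothetical layout).
    hardlink_groups: lists of relative paths that share one inode.
    """
    entries = {}
    by_inode = defaultdict(list)  # (st_dev, st_ino) -> relative paths

    # os.walk does not descend into symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in sorted(dirnames + filenames):
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            st = os.lstat(full)  # lstat describes the link itself, not its target
            meta = {
                "mode": stat.S_IMODE(st.st_mode),
                "uid": st.st_uid,
                "gid": st.st_gid,
                "mtime": st.st_mtime,
                "size": st.st_size,
            }
            if stat.S_ISLNK(st.st_mode):
                meta["type"] = "symlink"
                meta["target"] = os.readlink(full)  # store the target, do not follow it
            elif stat.S_ISDIR(st.st_mode):
                meta["type"] = "dir"
            elif stat.S_ISREG(st.st_mode):
                meta["type"] = "file"
                if st.st_nlink > 1:
                    # More than one name points at this inode: hardlink candidate.
                    by_inode[(st.st_dev, st.st_ino)].append(rel)
            else:
                meta["type"] = "other"  # devices, FIFOs, sockets
            entries[rel] = meta

    # Keep only inodes for which at least two names were seen inside `root`.
    hardlink_groups = [sorted(p) for p in by_inode.values() if len(p) > 1]
    return entries, hardlink_groups
```

With the groups in hand, a backup could upload the contents of each hardlinked file once and record the remaining names as references to it, and a restore could recreate the links rather than duplicate the data.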
I should probably write up another design post, because I've made some major
changes since my initial thoughts, but I think I'll get back to hacking
instead :-)
> I think we should start adding tahoe-dev at allmydata.org to the Cc:
> line on trac tickets that are likely to be of interest to readers of
> the list.
That's a very good idea, since most of us probably don't follow the
Trac tickets closely.
Shawn.