[tahoe-dev] BackupDB proposal

Aleksandr Milewski zandr at allmydata.com
Thu May 29 09:34:10 PDT 2008


On May 29, 2008, at 01:45 , Ben Laurie wrote:
>
> Rather than messing around with a database, I would store hashes
> alongside each file and check whether the hash has changed. Obviously
> you incur the cost of rehashing the local file each time, but, well,
> who cares?

Users care.

Rehashing the entire filesystem every time you're trying to run an  
incremental backup is obnoxious. It will take a large amount of disk  
IO and CPU time to do even a small backup.

Putting the hashes alongside each file is also obnoxious for a couple  
of reasons, but most importantly because a backup is nominally a read  
operation. Scribbling all over the filesystem you're supposed to be  
backing up is a bad idea.

So, you could create a parallel tree with the file hashes, but if  
you're going to do that, then a database is faster and easier. It is  
also, FWIW, common practice in backup tools.
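
To make the database approach concrete, here is a rough sketch of the kind of check I have in mind. It is an illustration only, not the BackupDB proposal's actual schema: the table layout, the function names (open_db, hash_file, check_file), and the choice of size plus mtime as the cheap "probably unchanged" signal are all my assumptions. The point is that an unchanged file costs one stat() and one database lookup, and the source tree is never written to.

```python
import hashlib
import os
import sqlite3


def open_db(db_path):
    """Open (or create) the hypothetical backup database, kept outside the source tree."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        " path TEXT PRIMARY KEY, size INTEGER, mtime_ns INTEGER, sha256 TEXT)"
    )
    return conn


def hash_file(path):
    """Hash file contents in 1 MiB chunks so large files are not loaded whole."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def check_file(conn, path):
    """Return (needs_backup, rehashed) for one file.

    Assumption: if size and mtime match the recorded values, the file is
    treated as unchanged and is not read at all.
    """
    st = os.stat(path)
    row = conn.execute(
        "SELECT size, mtime_ns, sha256 FROM files WHERE path = ?", (path,)
    ).fetchone()
    if row is not None and row[0] == st.st_size and row[1] == st.st_mtime_ns:
        return False, False  # cheap path: no file read, no hashing
    digest = hash_file(path)  # expensive path: new or possibly modified file
    changed = row is None or row[2] != digest
    conn.execute(
        "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)",
        (path, st.st_size, st.st_mtime_ns, digest),
    )
    conn.commit()
    return changed, True
```

The obvious caveat is that a size-plus-mtime check will miss a modification that preserves both, so a real tool would probably want an option to force a full rehash now and then.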

