[tahoe-dev] [tahoe-lafs] #1354: compression (e.g. to efficiently store sparse files)
tahoe-lafs
trac at tahoe-lafs.org
Thu Feb 3 10:38:37 PST 2011
#1354: compression (e.g. to efficiently store sparse files)
-------------------------------+--------------------------------------------
Reporter: zooko | Owner:
Type: enhancement | Status: new
Priority: major | Milestone: undecided
Component: code-encoding | Version: 1.8.2
Resolution: | Keywords:
Launchpad Bug: |
-------------------------------+--------------------------------------------
Comment (by warner):
Neat trick!
No, from what I've seen, sparse files are not very common. The only things
that come to mind are coredumps and database files, and I suspect that
most modern (cross-platform compatible) DB files are not sparse anymore.
It shouldn't be too hard to rig up a tool to test that claim: walk the
tree with {{{os.walk}}}, use {{{stat}}} to count the number of blocks
allocated to each file, and compare that against {{{st_size/blocksize}}}.
If the two are too far off, you've probably got a sparse file.
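A rough sketch of such a tool, under a few assumptions that are not part of the ticket: {{{st_blocks}}} is counted in 512-byte units (true on Linux and most POSIX systems, and the field is absent on Windows), and "too far off" means less than half of the apparent size is actually allocated. The names {{{looks_sparse}}}, {{{find_sparse_files}}} and the {{{threshold}}} parameter are made up for illustration.

```python
import os
import stat

# Assumption: st_blocks counts 512-byte units, as on Linux and most POSIX systems.
STAT_BLOCK_BYTES = 512


def looks_sparse(st_size, st_blocks, threshold=0.5):
    """Return True if the allocated bytes fall well short of the apparent size.

    threshold is an arbitrary cutoff chosen for this sketch: a file is flagged
    when less than that fraction of its apparent size is allocated on disk.
    """
    if st_size <= 0:
        return False
    return st_blocks * STAT_BLOCK_BYTES < st_size * threshold


def find_sparse_files(root, threshold=0.5):
    """Walk root and yield (path, apparent_size, allocated_bytes) for likely sparse files."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # only regular files can be sparse
            blocks = getattr(st, "st_blocks", None)
            if blocks is None:
                continue  # platform does not report allocated blocks
            if looks_sparse(st.st_size, blocks, threshold):
                yield path, st.st_size, blocks * STAT_BLOCK_BYTES


if __name__ == "__main__":
    import sys
    for path, size, allocated in find_sparse_files(sys.argv[1] if len(sys.argv) > 1 else "."):
        print("%s: apparent %d bytes, allocated %d bytes" % (path, size, allocated))
```

Filesystems that compress transparently will also report fewer allocated blocks than the apparent size, so a tool like this would flag some files that are not sparse.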
The question of compression is an interesting one. To retain our low
alacrity, we'd want to compress each segment separately, which would then
result in variable-sized segments, so we'd need a new share layout (with a
start-of-each-block table). Compressing the whole file would let us
squeeze it down further, of course, but you can't generally get random
access that way. There may be some clever trick wherein we might save a
copy of the compressor's internal state between segments to allow both
random access *and* good whole-file compression, but I'd be afraid of the
complexity/fragility of that approach.
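To make the per-segment idea concrete, here is a minimal sketch of compressing each segment on its own and keeping a start-of-each-block table. This is not the Tahoe-LAFS share layout, just an illustration: {{{zlib}}} stands in for whatever compressor would be chosen, the 128 KiB segment size is an illustrative value, and {{{compress_segments}}}, {{{read_segment}}} and {{{read_range}}} are hypothetical names.

```python
import zlib

SEGMENT_SIZE = 128 * 1024  # illustrative value, not taken from the ticket


def compress_segments(data, segment_size=SEGMENT_SIZE):
    """Compress each fixed-size plaintext segment independently.

    Returns (offsets, body). offsets[i] is where compressed segment i starts
    in body; a final sentinel entry marks the end of the last segment.
    """
    offsets = []
    blobs = []
    pos = 0
    for start in range(0, len(data), segment_size):
        blob = zlib.compress(data[start:start + segment_size])
        offsets.append(pos)
        blobs.append(blob)
        pos += len(blob)
    offsets.append(pos)  # sentinel: total compressed length
    return offsets, b"".join(blobs)


def read_segment(offsets, body, index):
    """Decompress one segment using only its own bytes (random access)."""
    return zlib.decompress(body[offsets[index]:offsets[index + 1]])


def read_range(offsets, body, start, length, segment_size=SEGMENT_SIZE):
    """Read plaintext bytes [start, start+length), touching only the needed segments."""
    if length <= 0:
        return b""
    first = start // segment_size
    last = min((start + length - 1) // segment_size, len(offsets) - 2)
    joined = b"".join(read_segment(offsets, body, i) for i in range(first, last + 1))
    skip = start - first * segment_size
    return joined[skip:skip + length]
```

A reader only has to fetch and decompress the segments that cover the requested range, which is what keeps alacrity low. The cost is the extra table and the compression ratio lost by restarting the compressor at every segment boundary.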
--
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1354#comment:1>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage