[tahoe-lafs-trac-stream] [tahoe-lafs] #1513: memory usage in MDMF publish
tahoe-lafs
trac at tahoe-lafs.org
Sun Aug 28 15:41:19 PDT 2011
#1513: memory usage in MDMF publish
------------------------------+--------------------------
     Reporter:  warner        |       Owner:
         Type:  defect        |      Status:  new
     Priority:  minor         |   Milestone:  1.9.0
    Component:  code-mutable  |     Version:  1.9.0a1
   Resolution:                |    Keywords:  mutable mdmf
Launchpad Bug:                |
------------------------------+--------------------------
Comment (by warner):
Hm, there's a tension between reliability and memory footprint
here. When making changes, we want each share to atomically jump from
version1 to version2, without it being left in any intermediate state. But
that means all of the changes need to be held in memory and applied at the
same time.
When we're jumping from "no such share" to version1, those changes are the
entire file. The data needs to be buffered *somewhere*. If we were allowed
to write one segment at a time to the server's disk, then a server failure
or lost connection would leave us in an intermediate state, where the
share only had a portion of version1, which would effectively be a corrupt
share.
I can think of a couple of ways to improve this:
* special-case the initial share creation: give the client an API to
incrementally write blocks to the new share, and either allow the world to
see the incomplete share early, or put the partial share in a separate
incoming/ directory and figure out a way to only make it visible to the
client that's building it.
* create an API to build a new version of the share one change at a time,
then a second API call to finalize the change (and make the new version
visible to the world). It might look something like the immutable
share-building API (see the sketch after this list):
* edithandle = share.start_editing()
* edithandle.apply_delta(offset, newdata)
* edithandle.finish()
* edithandle.abort()
* finish() is the test-and-set operation: it might fail if some other
writer has completed their own start_editing()/apply_delta()/finish()
sequence first.
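To make that concrete, here's a rough sketch of how the client-side publish
code might drive such an API, applying one segment-sized delta at a time so
only a single segment has to sit in memory. publish_new_version() and the
other names are made up for illustration, not existing tahoe-lafs interfaces:

    # Hypothetical usage sketch, not existing tahoe-lafs code: apply one
    # delta per segment so the client only buffers a single segment.
    def publish_new_version(share, segments, segment_size):
        edithandle = share.start_editing()
        try:
            for segnum, segdata in enumerate(segments):
                edithandle.apply_delta(segnum * segment_size, segdata)
            edithandle.finish()  # test-and-set: fails if another writer won
        except Exception:
            edithandle.abort()   # the old version stays visible, untouched
            raise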
If we're willing to tolerate the disk footprint, we could increase
reliability against server crashes by making start_editing() create a full
copy of the old share in a sibling directory (like incoming/, not visible
to anyone but the edithandle). Then apply_delta() would do normal write()s
to the copy, and finish() would atomically move the copy back into place.
Everything in the incoming/ directory would be deleted at startup, and the
temp copies would also be deleted when the connection to the client was
lost. This would slow down updates to large files (since a lot of
data would need to be shuffled around before the edit could begin), and
would consume more disk (twice the size of the share), but would allow
edits to be spread across separate messages, which reduces the client's
memory requirements. It would also reduce share corruption caused by the
server being bounced during a mutable write.
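For illustration, a minimal sketch of what that copy-then-atomic-rename
scheme could look like on the storage server, assuming an EditHandle class,
an incoming/ staging directory, and a get_version() callable for the
test-and-set check (all made-up names, not the real storage-server code):

    # Hypothetical server-side sketch of start_editing()/apply_delta()/
    # finish(): stage a full copy under incoming/, write deltas to the
    # copy, then atomically rename it into place. Not actual tahoe-lafs
    # storage-server code.
    import os, shutil

    class UncoordinatedWriteError(Exception):
        pass

    class EditHandle:
        def __init__(self, share_path, incoming_dir, get_version):
            self._share_path = share_path
            self._get_version = get_version        # returns current seqnum
            self._base_version = get_version()
            self._temp_path = os.path.join(incoming_dir,
                                           os.path.basename(share_path))
            shutil.copyfile(share_path, self._temp_path)  # full copy up front

        def apply_delta(self, offset, newdata):
            # ordinary writes against the private copy; the visible share
            # is untouched until finish()
            with open(self._temp_path, "r+b") as f:
                f.seek(offset)
                f.write(newdata)

        def finish(self):
            # test-and-set: fail if another writer finished while we edited
            if self._get_version() != self._base_version:
                self.abort()
                raise UncoordinatedWriteError()
            # atomic when incoming/ is on the same filesystem as the shares
            os.rename(self._temp_path, self._share_path)

        def abort(self):
            if os.path.exists(self._temp_path):
                os.remove(self._temp_path)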
--
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1513#comment:1>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage