[tahoe-lafs-trac-stream] [tahoe-lafs] #1640: the mutable publisher should try harder to place all shares
tahoe-lafs
trac at tahoe-lafs.org
Sat Dec 17 22:32:45 UTC 2011
#1640: the mutable publisher should try harder to place all shares
---------------------+---------------------------
 Reporter:  kevan    |          Owner:  nobody
     Type:  defect   |         Status:  new
 Priority:  major    |      Milestone:  undecided
Component:  unknown  |        Version:  1.9.0
 Keywords:           |  Launchpad Bug:
---------------------+---------------------------
If a connection error is encountered while pushing a share to a storage
server, the mutable publisher forgets about the writer object associated
with the (share, server) placement. This is consistent with the pre-1.9
publisher and, in high-level terms, means that the publisher treats that
share placement as probably invalid, attributing the error to a server
failure or something like it. The pre-1.9 publisher then attempts to find
another home for the share placed on the broken server. The current
publisher doesn't.

When I first wrote the publisher, I wanted to support streaming upload of
mutable files. That made it hard to find a new home for a share placed on
a broken storage server, since we wouldn't necessarily still have all of
the parts of the share that we had generated and placed before the failure
available to upload to a new server. We ended up ditching streaming
uploads due to other concerns; instead, we write a share all at once, and
we have everything we will write to a storage server available to us when
we write. Given this, there's no compelling reason that the publisher
couldn't attempt to find a new home for shares placed on broken servers.
Ensuring that all shares are placed if at all possible makes it more
likely that there will be a recoverable version of the mutable file
available after an update.

In practical terms, the current behavior (not re-placing shares) increases
the chance of data loss somewhat, in proportion to the number of servers
that fail during a publish operation. If too many storage servers fail
during the upload process and too much of the initial share placement is
lost due to these failures, the newly placed mutable file might not be
recoverable. A fix would involve a way to change the server associated
with a writer after the writer is created, and probably some control flow
changes to ensure that write failures result in shares being reassigned.
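
To make the proposed fix concrete, here is a minimal sketch of the idea.
It is not the Tahoe-LAFS publisher code: every name in it (FakeServer,
ShareWriter, set_server, publish, ServerUnreachable) is hypothetical and
made up for illustration, and it is synchronous, whereas the real
publisher is asynchronous. It only shows a writer that keeps the whole
share, can be pointed at a different server, and is retried on a spare
server when a write fails.

```python
class ServerUnreachable(Exception):
    """Hypothetical stand-in for a connection error during a share push."""


class FakeServer:
    """Hypothetical in-memory storage server, used only for this sketch."""

    def __init__(self, name, broken=False):
        self.name = name
        self.broken = broken
        self.shares = {}

    def write_share(self, shnum, data):
        if self.broken:
            raise ServerUnreachable(self.name)
        self.shares[shnum] = data


class ShareWriter:
    """Hypothetical writer that holds the complete share, so it can be re-sent."""

    def __init__(self, shnum, data, server):
        self.shnum = shnum
        self.data = data
        self.server = server

    def set_server(self, server):
        # The "change the server after the writer is created" part of the fix.
        self.server = server

    def push(self):
        self.server.write_share(self.shnum, self.data)


def publish(writers, spare_servers):
    """Push every share; on a write failure, reassign the writer to a spare.

    Returns (placed, unplaced): placed maps share number to server name,
    unplaced lists share numbers for which no working server was found.
    """
    spares = list(spare_servers)
    placed = {}
    unplaced = []
    for writer in writers:
        while True:
            try:
                writer.push()
            except ServerUnreachable:
                if not spares:
                    unplaced.append(writer.shnum)
                    break
                # Instead of forgetting the writer, give it a new home and retry.
                writer.set_server(spares.pop(0))
            else:
                placed[writer.shnum] = writer.server.name
                break
    return placed, unplaced


# Example: server B is broken, so share 1 is moved to spare server C.
good, bad, spare = FakeServer("A"), FakeServer("B", broken=True), FakeServer("C")
writers = [ShareWriter(0, b"share0", good), ShareWriter(1, b"share1", bad)]
placed, unplaced = publish(writers, [spare])
```

The sketch assumes the spare servers are ones not already holding a share
from the initial placement; a real fix would also have to decide how to
pick replacement servers and how to treat other shares assigned to a
server that has just failed.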
--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1640>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage