[tahoe-lafs-trac-stream] [tahoe-lafs] #1640: the mutable publisher should try harder to place all shares

tahoe-lafs trac at tahoe-lafs.org
Sat Dec 17 22:32:45 UTC 2011


#1640: the mutable publisher should try harder to place all shares
---------------------+---------------------------
 Reporter:  kevan    |          Owner:  nobody
     Type:  defect   |         Status:  new
 Priority:  major    |      Milestone:  undecided
Component:  unknown  |        Version:  1.9.0
 Keywords:           |  Launchpad Bug:
---------------------+---------------------------
 If a connection error is encountered while pushing a share to a storage
 server, the mutable publisher forgets about the writer object associated
 with the (share, server) placement. This is consistent with the pre-1.9
 publisher and, in high-level terms, means that the publisher treats that
 share placement as probably invalid, attributing the error to a server
 failure or something like it. The pre-1.9 publisher then attempts to find
 another home for the share placed on the broken server. The current
 publisher doesn't.

 When I first wrote the publisher, I wanted to support streaming upload of
 mutable files. That made it hard to find a new home for a share placed on
 a broken storage server, since we wouldn't necessarily have all of the
 parts of the share we generated and placed before the failure available to
 upload to a new server. We ended up ditching streaming uploads due to
 other concerns; instead, we write a share all at once, and we have
 everything we will write to a storage server available to us when we
 write. Given this, there's no compelling reason that the publisher
 couldn't attempt to find a new home for shares placed on broken servers.
 Ensuring that all shares are placed if at all possible makes it more
 likely that there will be a recoverable version of the mutable file
 available after an update.

 In practical terms, the current behavior increases the chance of data
 loss somewhat, in proportion to the number of servers that fail during a
 publish operation. If too many storage servers fail during the upload and
 too much of the initial share placement is lost to these failures, the
 newly-placed mutable file might not be recoverable. A fix would involve a
 way to change the server associated with a writer after the writer is
 created, and probably some control flow changes to ensure that write
 failures result in shares being reassigned.
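
 A rough sketch of the reassignment step might look like the following.
 It is only an illustration: the function names, the plain dict of
 share number to server id, and the least-loaded selection policy are
 all assumptions made here, not the publisher's actual writer objects or
 server selection logic.

```python
from collections import Counter


def reassign_failed_shares(placements, failed_servers, all_servers):
    """Move shares off failed servers onto the least-loaded healthy ones.

    Hypothetical helper, not part of the Tahoe-LAFS code base.

    placements     -- dict: share number -> server id (initial placement)
    failed_servers -- set of server ids whose writes raised an error
    all_servers    -- list of known server ids, in preference order
    Returns (new_placements, unplaced_share_numbers).
    """
    healthy = [s for s in all_servers if s not in failed_servers]
    # Keep every placement whose server is still believed to be good.
    new_placements = {sharenum: server
                      for sharenum, server in placements.items()
                      if server not in failed_servers}
    load = Counter(new_placements.values())
    unplaced = []
    for sharenum in sorted(placements):
        if placements[sharenum] not in failed_servers:
            continue
        if not healthy:
            # Nowhere left to put this share.
            unplaced.append(sharenum)
            continue
        # min() returns the first minimal element, so ties go to the
        # server that appears earliest in the preference order.
        target = min(healthy, key=lambda s: load[s])
        new_placements[sharenum] = target
        load[target] += 1
    return new_placements, unplaced


def is_recoverable(placements, k):
    """A k-of-N encoded file needs at least k distinct shares placed."""
    return len(placements) >= k


initial = {0: "A", 1: "B", 2: "C", 3: "D"}
fixed, unplaced = reassign_failed_shares(
    initial, failed_servers={"B", "D"}, all_servers=["A", "B", "C", "D", "E"])
```

 In this example, with k = 3, the two shares that survive on A and C are
 not enough to recover the file, while the reassigned placement holds all
 four shares. A real fix would also have to retry the writes against the
 new servers and handle failures of those retries.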

-- 
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1640>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage

