[tahoe-lafs-trac-stream] [tahoe-lafs] #1209: repair of mutable files/directories should not increment the sequence number
tahoe-lafs
trac at tahoe-lafs.org
Tue Nov 20 01:43:00 UTC 2012
#1209: repair of mutable files/directories should not increment the sequence
number
-------------------------+-------------------------------------------------
Reporter: gdt            | Owner: davidsarah
Type: defect             | Status: assigned
Priority: major          | Milestone: 1.11.0
Component: code-mutable  | Version: 1.8.0
Resolution:              | Keywords: repair mutable preservation space-efficiency
Launchpad Bug:           |
-------------------------+-------------------------------------------------
Old description:
> Particularly with my root directory, I often find that 9 shares of seqN
> are available compared to 10 desired. I do a repair, and this results in
> 10 shares of seqN+1 and then 9 are deleted. Then the next day there are 9
> of seqN+1 and 1 of seqN, and the file is again not healthy. This repeats
> daily.
>
> It seems that the missing seqN shares should be generated and placed, and
> then when servers churn more it's likely that 10 can still be found, and
> no unrecoverable versions. Perhaps I don't get something, but the current
> behavior is not stable with intermittent servers.
>
> I have observed this problem with directories, but it seems likely that
> it applies to all ~~im~~mutable files.
New description:
Particularly with my root directory, I often find that 9 shares of seqN
are available compared to 10 desired. I do a repair, and this results in
10 shares of seqN+1 and then 9 are deleted. Then the next day there are 9
of seqN+1 and 1 of seqN, and the file is again not healthy. This repeats
daily.

It seems that the missing seqN shares should be generated and placed, and
then when servers churn more it's likely that 10 can still be found, and
no unrecoverable versions. Perhaps I don't get something, but the current
behavior is not stable with intermittent servers.

I have observed this problem with directories, but it seems likely that it
applies to all ~~im~~mutable files.
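
The churn described above can be reproduced with a toy model. This is only a sketch under loud assumptions: one share per server, a hypothetical grid of 11 servers named s0..s10, a hypothetical starting sequence number of 5, and a `repair` helper that is not Tahoe-LAFS code. It compares a repair that bumps the sequence number with one that keeps it.

```python
from collections import Counter

SERVERS = [f"s{i}" for i in range(11)]  # hypothetical grid of 11 servers
SEQ_N = 5                                # hypothetical current sequence number

def visible(shares, offline):
    """Count, per sequence number, the shares held by reachable servers."""
    return Counter(seq for server, seq in shares.items() if server not in offline)

def repair(shares, offline, bump):
    """Write one share to every reachable server; offline servers keep what they had."""
    reachable = [s for s in SERVERS if s not in offline]
    newest = max(shares[s] for s in reachable if s in shares)
    target = newest + 1 if bump else newest
    repaired = dict(shares)
    repaired.update({s: target for s in reachable})
    return repaired

# Day 0: ten shares of seqN on s0..s9; s10 holds nothing yet.
day0 = {s: SEQ_N for s in SERVERS[:10]}

# Day 1: s9 is offline, the checker sees 9 of 10 shares, and a repair runs.
bumped = repair(day0, offline={"s9"}, bump=True)   # current behavior
kept = repair(day0, offline={"s9"}, bump=False)    # behavior requested here

# Day 2: s9 is back and s10 has gone away.
print(dict(visible(bumped, offline={"s10"})))
print(dict(visible(kept, offline={"s10"})))
```

In this model the bumped repair leaves nine shares of seqN+1 and one of seqN visible on day 2, which is the unhealthy state reported above, while the same-sequence repair leaves ten shares of seqN.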
--
Comment (by gdt):
I think davidsarah's proposed algorithm is a good choice. A few comments:
* If there are shares of a version Q < R, then S = R, not Q. This
follows from the algorithm, but a design doc should perhaps make it
more obvious: stray shares of a version lower than the highest recoverable
version are not a problem.
* In the case where R is repaired, stray shares of a lower version
should be removed.
* In the case where S+1 is uploaded, shares of R, and in fact all shares
of versions <= S, should be removed.
* If R is recoverable and there are shares of S > R, then it's really
not clear what should happen. One possibility is to wait for a while
(days?), keep checking, and hope that enough shares of S turn up. But this
case is probably very unlikely, and it's unclear what ought to happen, so I
would defer that nuance to later.
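
The cases in these comments can be sketched as a small decision function. The proposed algorithm itself is not reproduced in this message, so everything here is an assumption: R is taken to be the highest recoverable sequence number, S the highest sequence number seen on any share, and `Plan`, `plan_repair`, and `k` (shares needed to recover a version) are hypothetical names, not Tahoe-LAFS APIs. For S > R the sketch takes the "upload S+1" branch, although the last point above leaves open whether waiting would be better.

```python
from typing import NamedTuple, Optional

class Plan(NamedTuple):
    action: str      # "repair" (same seqnum) or "upload" (new seqnum)
    write_seq: int   # sequence number of the shares to be written
    remove: list     # sequence numbers whose shares should be removed

def plan_repair(share_counts: dict, k: int) -> Optional[Plan]:
    """share_counts maps seqnum -> number of distinct shares found."""
    recoverable = [seq for seq, n in share_counts.items() if n >= k]
    if not recoverable:
        return None  # nothing can be recovered; out of scope for this sketch
    r = max(recoverable)   # R: highest recoverable version (assumed definition)
    s = max(share_counts)  # S: highest version seen at all (assumed definition)
    if s == r:
        # Repair R in place; stray shares of lower versions get removed.
        return Plan("repair", r, sorted(q for q in share_counts if q < r))
    # S > R: re-upload R's contents as S+1 and remove every share <= S.
    return Plan("upload", s + 1, sorted(share_counts))

print(plan_repair({7: 9, 6: 1}, k=3))
print(plan_repair({7: 9, 8: 1}, k=3))
```

With a stray share of version 6 next to a recoverable version 7, the plan is to repair 7 without bumping and remove 6. With a stray share of version 8, the plan is to upload version 9 and remove both 7 and 8.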
--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1209#comment:18>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage