[tahoe-dev] [tahoe-lafs] #1212: Repairing fails if less than 7 servers available
tahoe-lafs
trac at tahoe-lafs.org
Thu Sep 30 01:17:35 UTC 2010
------------------------------+---------------------------------------------
 Reporter:      eurekafag     |  Owner:
 Type:          defect        |  Status:     closed
 Priority:      major         |  Milestone:  soon
 Component:     code-network  |  Version:    1.8.0
 Resolution:    fixed         |  Keywords:   reviewed
 Launchpad Bug:               |
------------------------------+---------------------------------------------
Comment (by zooko):
Replying to [comment:12 kevan]:
> If we do that, we lose the property that the repairer will always try to
> place whichever shares are missing onto *some* storage servers, even if
> the end result isn't optimally distributed.
Doesn't this mean that {{{H}}} is effectively {{{0}}} for you when you are
doing this?
> I can also make my node's repair go for broke with share regeneration by
> changing the value of happiness in {{{tahoe.cfg}}} to be 0. This is a
> chore, but it means that people who really want the repairer to try to
> place new shares regardless of where can still get that behavior.
Right. If you want this behavior, set {{{H==0}}}. If you want the other
behavior (abort the repair), set {{{H}}} to something else. With the v1.7.1
behavior and the current trunk behavior (since
20100927200102-b8d28-9111a341188a4264e5070f91b52364a2addcb3dc), setting
{{{H}}} in your {{{tahoe.cfg}}} has no effect on repairer behavior: the
repairer always acts as though {{{H==0}}}.
> Maybe the best approach is to fix #614 with this in mind. The repairer
> could regenerate and try to place all of the missing shares, as it does
> now, but also tell the caller (in the post repair results) whether the
> repair was ultimately successful or not based on how the shares are
> distributed, using the client's configured happiness value for that check.
Oh, good catch. Yes, if we fix #614 then the repairer would be using {{{H}}}
(during the check/verify step) to determine whether or not to trigger a
repair. Once a repair has been triggered, it could ''also'' use {{{H}}}
to determine whether to abort the repair, or it could instead treat
{{{H}}} as effectively {{{0}}} for the purpose of the repair.
Now that I've thought about it more and read your comments, Kevan, I think
I agree that we should have the latter behavior, as long as we fix #614 so
that the output reported by the repairer can be easily understood by the
user as indicating "unhealthy" when the servers of happiness is less than
{{{H}}}.
Oh, in fact, what I ''really'' want is for repairer to ''proceed'' and to
do its best even if it knows that it can't reach servers of happiness
greater than or equal to {{{H}}} (instead of aborting the way uploader
does), but then to return a failure result saying that it wasn't able to
repair the file back to health.
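
A rough sketch of that behavior in Python, for illustration only. Every
name here ({{{repair}}}, {{{RepairResult}}}, {{{servers_of_happiness}}}) is
hypothetical and is not the actual repairer code. It assumes that servers of
happiness can be computed as the size of a maximum matching between shares
and the servers holding them, and it places missing shares round-robin just
to have some placement to judge.

```python
from dataclasses import dataclass


def servers_of_happiness(placement):
    """Size of a maximum matching between shares and servers.

    placement maps share number -> set of server ids holding that share.
    (Assumed definition; simple augmenting-path matching.)
    """
    match = {}  # server id -> share number currently matched to it

    def try_assign(share, seen):
        for server in placement.get(share, ()):
            if server in seen:
                continue
            seen.add(server)
            # Take a free server, or evict its share if that share can move.
            if server not in match or try_assign(match[server], seen):
                match[server] = share
                return True
        return False

    return sum(1 for share in placement if try_assign(share, set()))


@dataclass
class RepairResult:
    placement: dict   # share number -> set of server ids, after the repair
    happiness: int    # servers of happiness actually reached
    healthy: bool     # True only if happiness >= H


def repair(placement, missing_shares, available_servers, happy):
    """Proceed with the repair no matter what, then report against H."""
    servers = sorted(available_servers)
    new_placement = {share: set(ids) for share, ids in placement.items()}
    if servers:
        # Best effort: put every missing share *somewhere* (round-robin).
        for i, share in enumerate(sorted(missing_shares)):
            new_placement.setdefault(share, set()).add(servers[i % len(servers)])
    reached = servers_of_happiness(new_placement)
    # Never abort; the caller learns from `healthy` whether H was met.
    return RepairResult(new_placement, reached, reached >= happy)
```

With 10 shares, only 5 servers, and {{{H==7}}}, this regenerates and places
all the missing shares but still reports the file as unhealthy, because only
5 distinct servers can be matched to shares.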
Does that make sense?
Okay, I'm done changing my mind for the moment. What do you think?
> Edit: I didn't read Zooko's comment closely enough. Is what I describe
> in the third paragraph what the repairer already does? If so, what don't
> you like about that?
Sorry: I don't understand this question. Hopefully I answered it above.
--
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1212#comment:13>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage