[tahoe-dev] Manual rebalancing in 1.10.0?

Kyle Markley kyle at arbyte.us
Tue Sep 17 15:45:24 UTC 2013


It would be my pleasure.  But I won't have time to do it until the weekend.

It might be faster, and all-around better, to create a unit test that 
exercises the scenario in my original message.  Then my buildbot (which 
has way more free time than I do) can try it for me.

Incidentally, I understand how I created that scenario.  The machine 
that had all the shares is always on, and runs deep-check --repair from 
cron.  My other machines aren't reliably on the grid, so after repeated 
repair operations, the always-on machine tends to get a lot of shares.  
Eventually, it accumulated shares.needed shares, and then a repair happened 
while it was the only machine on the grid.  Because repair didn't care 
about shares.happy, this machine got all shares.total shares.  Then, 
because an upload cares about shares.happy but wouldn't rebalance, it 
had to fail.
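
The sequence above can be sketched with a toy model.  This is not 
Tahoe-LAFS code: the function names, the machine names, and the values 
chosen for K, HAPPY, and N are all assumptions made for illustration, 
and "happiness" is modeled here as the size of a maximum matching 
between servers and distinct shares, which is a simplified reading of 
the servers-of-happiness idea.

```python
# Toy model only, not Tahoe-LAFS code. All names and values are illustrative.
K, HAPPY, N = 2, 3, 6  # stand-ins for shares.needed / shares.happy / shares.total


def happiness(placement):
    """Size of a maximum matching between servers and distinct shares.

    placement maps a server name to the set of share numbers it holds.
    """
    match = {}  # share number -> server currently matched to it

    def try_assign(server, seen):
        for share in placement[server]:
            if share in seen:
                continue
            seen.add(share)
            # Take a free share, or re-route the server that holds it now.
            if share not in match or try_assign(match[share], seen):
                match[share] = server
                return True
        return False

    return sum(try_assign(server, set()) for server in placement)


def naive_repair(placement, online):
    """Regenerate shares not visible on online servers and place them on
    online servers round-robin, without checking HAPPY."""
    visible = set()
    for server in online:
        visible |= placement.get(server, set())
    if len(visible) < K:
        raise RuntimeError("fewer than K shares reachable; cannot repair")
    missing = [s for s in range(N) if s not in visible]
    targets = sorted(online)
    for i, share in enumerate(missing):
        placement.setdefault(targets[i % len(targets)], set()).add(share)


# The always-on machine has accumulated K shares; the others hold the rest.
placement = {"always_on": {0, 1}, "laptop": {2, 3}, "desktop": {4, 5}}
before = happiness(placement)

# A repair runs while always_on is the only machine on the grid.
naive_repair(placement, online={"always_on"})
reachable = {"always_on": placement["always_on"]}
after = happiness(reachable)
upload_would_succeed = after >= HAPPY
```

In this toy run, happiness is 3 before the repair and 1 afterwards when 
only the always-on machine is reachable, so a happiness check with 
HAPPY = 3 is not met even though that one machine holds all N shares.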

A grid whose nodes don't have similar uptime is surprisingly fragile.  
Failure of that single always-on machine makes the file totally 
unretrievable, which is definitely not the desired behavior.



On 09/16/13 09:57, Zooko O'Whielacronx wrote:
> Dear Kyle:
>
> Could you try Mark Berger's #1382 patch on your home grid and tell us
> if it fixes the problem?
>
> https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1382 # immutable peer
> selection refactoring and enhancements
>
> https://github.com/tahoe-lafs/tahoe-lafs/pull/60
>
> Regards,
>
> Zooko


-- 
Kyle Markley


