#232 new defect

Peer selection doesn't rebalance shares on overwrite of mutable file.

Reported by: zooko Owned by: warner
Priority: major Milestone: soon
Component: code-mutable Version: 0.7.0
Keywords: repair preservation Cc:
Launchpad Bug:


When you upload a new version of a mutable file, it currently uploads the new shares to peers which already have old shares, then checks that enough shares have been uploaded, then is happy. However, this means it never "rebalances", so if there were few peers (or just yourself!) the first time, and many peers the second time, the file is still stored on only those few peers.

This is an instance of the general principle that shares are not the right units for robustness measurements -- servers are.

Change History (23)

comment:1 Changed at 2007-12-13T00:18:58Z by zooko

  • Milestone changed from 0.7.0 to 0.7.1
  • Owner changed from warner to zooko
  • Status changed from new to assigned
  • Summary changed from peer selection doesn't rebalance shares to peer selection doesn't rebalance shares on overwrite of mutable file

Actually I'm going to bump this out of the v0.7.0 Milestone and instead document that you need at least as many servers as your "total shares" parameter if you want robust storage. As mentioned in http://allmydata.org/trac/tahoe/ticket/115#comment:10 , the WUI should be enhanced to indicate the status of the creation of the private directory to the user.

comment:2 Changed at 2007-12-17T23:48:53Z by zooko

This is related to ticket #213 -- "good handling of small numbers of servers, or strange choice of servers".

comment:3 Changed at 2007-12-19T22:49:54Z by zooko

  • Summary changed from peer selection doesn't rebalance shares on overwrite of mutable file to Peer selection doesn't rebalance shares on overwrite of mutable file.

comment:4 Changed at 2008-01-05T04:04:38Z by warner

  • Milestone changed from 0.7.1 to 0.8.0

This is an instance of the general principle that shares are not the right units for robustness measurements -- servers are.

Oh, I think it's actually more complicated than that. When we decide to take the plunge, our peer selection algorithm should be aware of the chassis, rack, and colo of each storage server. It should start by putting shares in different colos. If it is forced to put two shares in the same colo, it should try to put them in different racks. If they must share a rack, put them in different chassis. If they must share a chassis, put them on different disks. Only when all other options are exhausted should two shares be put on the same disk (but we shouldn't be happy about it).
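The tiered placement described above could be sketched like this; the topology attributes and function names are assumptions for illustration, not part of Tahoe's real peer-selection code:

```python
# Hypothetical sketch of tier-aware share placement: spread shares across
# colos first, then racks, then chassis, then disks. A server's "score"
# is how crowded each of its tiers already is, compared colo-first.
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class Server:
    name: str
    colo: str
    rack: str
    chassis: str
    disk: str

TIERS = ("colo", "rack", "chassis", "disk")

def place_shares(servers, num_shares):
    """Assign each share to the server whose location is least used,
    comparing usage tier by tier (colo outermost, disk innermost)."""
    usage = {tier: Counter() for tier in TIERS}
    placements = []
    for _ in range(num_shares):
        best = min(servers, key=lambda s: tuple(
            usage[t][getattr(s, t)] for t in TIERS))
        placements.append(best)
        for t in TIERS:
            usage[t][getattr(best, t)] += 1
    return placements
```

Because the score tuple is compared lexicographically, two shares land on the same disk only after every colo, rack, and chassis option is equally exhausted.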

For now, in small grids, getting the shares onto different nodes is a good start.

When a mutable file is modified, it's fairly easy to detect an improvement that could be made and move shares to new servers. Another desirable feature would be for the addition of a new server to automatically kick off a wave of rebalancing. We have to decide how we want to trigger that, though: the most naive approach (sweep through all files and check/repair/rebalance each one every month) will have a bandwidth/disk-I/O cost that might be excessive and/or starve normal traffic.

I'm moving this to the 0.8.0 milestone since it matches the 0.8.0 goals. There are a couple of different levels of support we might provide, so once we come up with a plan, we might want to make a couple of new tickets and schedule them differently.

comment:5 Changed at 2008-01-05T19:06:36Z by zooko

One more wrinkle is that if N/(K+1) is large enough (>= 2, perhaps), it might make sense to put K+1 shares into the same colo, in order to enable regeneration of shares using only in-colo bandwidth.
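The arithmetic behind this wrinkle, with assumed example parameters (K and N are not fixed by this ticket): with K-of-N erasure coding, any K shares reconstruct the file, so a group of K+1 co-located shares can regenerate one lost member without cross-colo traffic.

```python
# Example values, not Tahoe defaults: K shares needed, N total shares.
K, N = 3, 10
group = K + 1            # shares co-located per colo for in-colo repair
assert N / group >= 2    # zooko's condition: grouping leaves >= 2 groups
full_groups = N // group # number of colos that can each self-heal
print(full_groups)       # -> 2
```

With N/(K+1) < 2 the grouping would concentrate too much of the file in one colo, which is why the condition is hedged at ">= 2, perhaps".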

comment:6 Changed at 2008-01-09T01:10:14Z by warner

  • Milestone changed from 0.8.0 (Allmydata 3.0 Beta) to 0.10.0

comment:7 Changed at 2008-05-12T19:51:16Z by zooko

  • Owner changed from zooko to warner
  • Status changed from assigned to new

Brian: did you leave this behavior unchanged in the recent mutable-file upload/download refactoring?

comment:8 Changed at 2008-05-12T20:15:44Z by warner

Yes, this behavior is unchanged, and this ticket remains open. The publish process will seek to update the shares in-place, and will only look for new homes for shares that cannot be found.

To get automatic rebalancing, the publish process (specifically Publish.update_goal) needs to count how many shares are present on each server, and gently try to find a new home for the extras if there is more than one ("gently" in the sense that it should leave a share where it is if no extra empty servers can be found). In addition, we need to consider deleting the old share rather than merely creating a new copy of it.
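The counting-and-redistributing step could look roughly like this; the goal representation (a set of (server, sharenum) pairs) is an assumption for illustration, not Publish.update_goal's actual data structure:

```python
# Hypothetical sketch of "gentle" rebalancing: keep one share per server
# in place, move extras to spare servers, and leave extras alone when no
# spare servers exist.
from collections import defaultdict

def rebalance(goal, all_servers):
    """goal: set of (server, sharenum) pairs. Returns an adjusted goal
    with at most one share per server, when spare servers permit."""
    per_server = defaultdict(list)
    for server, shnum in goal:
        per_server[server].append(shnum)
    spares = [s for s in all_servers if s not in per_server]
    new_goal = set()
    for server, shnums in sorted(per_server.items()):
        shnums.sort()
        new_goal.add((server, shnums[0]))        # keep one share in place
        for extra in shnums[1:]:
            if spares:
                new_goal.add((spares.pop(0), extra))  # move to a spare
            else:
                new_goal.add((server, extra))    # gentle: leave it alone
    return new_goal
```

Note this sketch only adjusts the upload goal; actually deleting the doubled-up old share is the separate question raised at the end of the comment.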

comment:9 Changed at 2008-05-29T22:20:44Z by warner

  • Milestone changed from 1.1.0 to 1.2.0

comment:10 Changed at 2008-06-10T23:03:57Z by warner

One additional thing to consider when working on this: if the mutable share lives on a server which is now full, the client should have the option of removing the share from that server (so it can go to a not-yet-full one). This can get tricky.

The first thing we need is a storage-server API to cancel leases on mutable shares, then code to delete the share when the lease count goes to zero. A mutable file that has multiple leases on it will be particularly tricky to consider.
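The lease accounting asked for above could be sketched as follows; the class and method names are assumptions, not the real storage-server API:

```python
# Hedged sketch: cancel a lease on a mutable share, and reclaim the
# share only when its last lease is gone. Multiple leaseholders are
# exactly the tricky case the comment warns about.
class MutableShare:
    def __init__(self):
        self.leases = set()    # lease secrets currently held on this share
        self.deleted = False

    def add_lease(self, secret):
        self.leases.add(secret)

    def cancel_lease(self, secret):
        self.leases.discard(secret)
        if not self.leases:
            self.deleted = True  # lease count hit zero: delete the share
```

The trickiness is visible even here: a client that cancels its own lease to force rebalancing must not delete a share that another client's lease still protects.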

comment:11 Changed at 2009-06-19T18:46:42Z by warner

  • Milestone changed from 1.5.0 to 1.6.0

comment:12 Changed at 2009-08-10T15:28:49Z by zooko

The following clump of tickets might be of interest to people who are interested in this ticket: #711 (repair to different levels of M), #699 (optionally rebalance during repair or upload), #543 ('rebalancing manager'), #232 (Peer selection doesn't rebalance shares on overwrite of mutable file.), #678 (converge same file, same K, different M), #610 (upload should take better advantage of existing shares), #573 (Allow client to control which storage servers receive shares).

comment:13 Changed at 2009-08-10T15:45:41Z by zooko

Also related: #778 ("shares of happiness" is the wrong measure; "servers of happiness" is better).

comment:14 Changed at 2009-10-28T07:13:48Z by davidsarah

  • Keywords reliability added

comment:15 Changed at 2009-10-28T07:33:39Z by davidsarah

  • Keywords integrity added

comment:16 Changed at 2009-10-28T07:50:39Z by davidsarah

  • Keywords integrity removed

Sorry, not integrity, only reliability.

comment:17 Changed at 2009-12-22T18:47:22Z by davidsarah

  • Keywords repair preservation added; reliability removed

comment:18 Changed at 2010-01-10T07:21:58Z by warner

  • Component changed from code-peerselection to code-mutable

moving this to category=mutable, since it's more of an issue with the mutable publish code than with the general category of peer selection

comment:19 Changed at 2010-01-26T15:41:14Z by zooko

  • Milestone changed from 1.6.0 to eventually

comment:20 Changed at 2010-05-26T14:49:09Z by zooko

  • Milestone changed from eventually to 1.8.0

It's really bothering me that mutable file upload and download behavior is so finicky, buggy, inefficient, hard to understand, different from immutable file upload and download behavior, etc. So I'm putting a bunch of tickets into the "1.8" Milestone. I am not, however, at this time, volunteering to work on these tickets, so it might be a mistake to put them into the 1.8 Milestone, but I really hope that someone else will volunteer or that I will decide to do it myself. :-)

comment:21 Changed at 2010-07-24T05:39:53Z by zooko

  • Milestone changed from 1.8.0 to soon

It was a mistake to put this ticket into the 1.8 Milestone. :-)

comment:22 Changed at 2012-12-06T21:43:36Z by davidsarah

  • Milestone changed from soon to 1.11.0

Related to #1057 (Alter mutable files to use servers of happiness). Ideally the server selection for mutable and immutable files would use the same code, as far as possible.

comment:23 Changed at 2012-12-06T22:36:17Z by davidsarah

See also #1816: ideally, only the shares that are still needed for the new version should have their leases renewed.
