#16 closed defect

move to Tahoe2 peer selection algorithm — at Version 11

Reported by: zooko Owned by: warner
Priority: major Milestone: 0.6.0
Component: code-peerselection Version: 0.5.0
Keywords: Cc:
Launchpad Bug:

Description (last modified by warner)

Implement the TahoeTwo peer-selection algorithm for share upload. This is the first-N-peers-in-the-permuted-list approach, as opposed to the TahoeThree "shares in baskets" scheme that's present in 0.5.0. TahoeThree can result severe non-uniform distribution of shares (in some cases, all shares being given to the same node), which hurts reliability, and is worse in small networks.

The share download is still ask-everybody, which will work up to maybe 100 peers. Eventually we'll move that to TahoeTwo as well.

In the longer run, we'll need to move to DenverAirport?, to get beyond a few hundred peers, but we're thinking post-0.6.0 for that.

Change History (11)

comment:1 Changed at 2007-04-28T19:17:26Z by warner

  • Component changed from component1 to documentation

comment:2 Changed at 2007-05-16T15:20:58Z by zooko

  • Status changed from new to assigned

At this time, I'm not able to remember the reasons either. I distinctly recall them seeming like good reasons at the time. ;-)

I'll try to remember.

comment:3 Changed at 2007-05-24T01:25:12Z by warner

it occurred to me, while pondering the consequences of varying our encoding parameters, that the PeerSelection/TahoeTwo? first-100-peers approach works very well if you know the verifierid but not the number of shares you need to get (you just ask successive peers until someone has some data and ask them for the encoding parameters, then you know how many more peers you need to talk to).

On the other hand, the PeerSelection/TahoeThree? (100-uniformly-distributed-clock-hands) approach fails very badly in this case. (at least once the mesh has significantly more than 100 peers).

comment:4 Changed at 2007-05-24T02:55:35Z by zooko

Oh I see. You mean if you know the verifierid but not the encoding parameters. Interesting! It's not such a bad failure, though, if you think of it as just enlarging the URI a bit (by the addition of encoding parameters).

Which reminds me that in zfec I figured out a way to compress the parameters into a mere 2, 3, or 4 bytes...

comment:5 Changed at 2007-05-24T18:07:15Z by warner

well, what I'm thinking is that the encoding parameters might change over time (perhaps because we decide that we don't need so much redundancy, perhaps because we switch to a new encoding mechanism). Having the encoding parameters in the URI means that we retain forwards-compatibility with these changes (depending upon how we compress/derive them, of course), but having them in the "share index" means that future people who encode the file one way won't discover the alternate encoding, either during upload (when they might refrain from making the new encoding because the old one is good enough), or during download (when they might find more shares from an alternate encoding than the one that they remember uploading).

It's kind of in the space of convergent encoding, I suppose: the mapping between crypttext (i.e. verifierid) and sets of shares. A one-to-one mapping is the most efficient (there will only be one encoded version of any given file), but one-to-many is likely to occur if we want to take advantage of our flexibility in choosing the encoding parameters.

comment:6 Changed at 2007-05-24T19:14:21Z by zooko

Good points.

comment:7 Changed at 2007-06-04T22:17:54Z by zooko

  • Owner changed from zooko to warner
  • Status changed from assigned to new

I think perhaps Brian has done this task by updating the PeerSelection and PeerSelection/TahoeThree? pages?

comment:8 Changed at 2007-06-05T02:54:20Z by warner

  • Owner changed from warner to zooko

Nope.. we still need a description of why we implemented tahoe3 instead of tahoe2. I've described the differences between them on those pages, but I can't remember our motivations that week, and I'm starting to believe that tahoe2 is a better choice than tahoe3. So I'm looking for history and explanation of our decision, as I can't come up with anything.

If zooko can't remember our reasons, then we should sit down and have a new talk about tahoe2 vs tahoe3.

comment:9 Changed at 2007-07-08T08:40:36Z by warner

  • Priority changed from minor to major

I'm finding more reasons to prefer TahoeTwo over TahoeThree: for small meshes, the random distribution of shares to nodes is very non-uniform in TahoeThree, which means that one node winds up with a lot more of the shares than others. In our three-node testnet, I'm seeing some files drop 90 (out of 100) shares on a single host. That blows our failure probabilities completely: some files will permute in such a way that a single lost node will lose the entire file.

I think we might have picked TahoeThree because we were working on implementation and wanted behavior that was easy to understand on both large meshes and small ones. TahoeThree has the advantage that you can usually do your bucket-allocation request in a single message per peer, which improves network efficiency (and latency) considerably for small meshes. In TahoeTwo you basically have to send out a separate message for each share, and you need some extra logic to handle small meshes where you have to swing around to the same peer multiple times.

But I'm starting to think that the uniform-distribution properties of TahoeTwo are very important. Perhaps we can take some shortcuts in TahoeTwo to avoid the network overhead: say, we start by offering one share per node, but then when we realize we've looped around and still have 97 shares to go, we divvy them up evenly among the peers who are still in the running, and the next time we ask one, we try to allocate a whole bunch of shares at once. If they refuse some, we equalize out the requests among all the remaining nodes.

I'm raising this one to 'major', because the potential for severe-non-uniformity means we'll fail to meet our relability goals for one of our significant use cases (small meshes among friends).

comment:10 Changed at 2007-08-20T19:18:56Z by warner

  • Component changed from documentation to code-peerselection
  • Milestone set to 1.0
  • Summary changed from explain/justify our current Peer Selection to consider moving to Tahoe2 peer selection algorithm
  • Version set to 0.5.0

comment:11 Changed at 2007-08-20T20:45:10Z by warner

  • Description modified (diff)
  • Milestone changed from 1.0 to 0.6.0
  • Owner changed from zooko to warner
  • Status changed from new to assigned
  • Summary changed from consider moving to Tahoe2 peer selection algorithm to move to Tahoe2 peer selection algorithm

in chatting with zooko, I've decided to add this to the list of stuff to do for 0.6, because the non-uniformity problem is important in my mind. Beyond 0.6, we probably need to sit down and work on DEN, because both TahoeTwo and TahoeThree require a fully-connected mesh, and I'd be leery of using that beyond a few hundred nodes.

Note: See TracTickets for help on using tickets.