#213 closed enhancement (duplicate)

good handling of small numbers of servers, or strange choice of servers

Reported by: zooko
Owned by:
Priority: major
Milestone: undecided
Component: code-peerselection
Version: 1.6.0
Keywords: usability availability preservation repair
Cc: ussjoin@…
Launchpad Bug:

Description (last modified by warner)

Suppose you try to upload something when you are on an airplane and you are completely disconnected from all of your servers other than the one that you yourself are running.

option 1. fail

option 2. silently upload all M shares to yourself

option 3. be transparent about this -- have output showing the user what is happening and a knob the user can use to control to what degree you can rely on yourself alone to store things

option 4. have a "rebalancing" operation in which data which is stored on a "skewed" set of servers (such as too few servers, or on servers which are less well placed on the unit circle) gets moved to a more appropriate set

option 5. be transparent about that, too
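
To make options 1 through 3 concrete, here is a minimal sketch of how they could hang off a single user-settable knob. Everything in it is an assumption for illustration: `FewServersPolicy`, `decide_upload`, and the `wanted_servers` threshold are hypothetical names, not part of Tahoe.

```python
from enum import Enum


class FewServersPolicy(Enum):
    """Hypothetical knob values; none of these names exist in Tahoe."""
    FAIL = "fail"      # option 1: refuse the upload
    SILENT = "silent"  # option 2: upload anyway, say nothing
    WARN = "warn"      # option 3: upload anyway, tell the user


def decide_upload(connected_servers, wanted_servers, policy):
    """Return (proceed, message) for one upload attempt.

    wanted_servers is an assumed user-set threshold below which the
    set of servers counts as "too small". message is None when there
    is nothing to report.
    """
    if connected_servers >= wanted_servers:
        return True, None
    if policy is FewServersPolicy.SILENT:
        return True, None
    note = "only connected to %d of %d wanted servers" % (
        connected_servers, wanted_servers)
    if policy is FewServersPolicy.FAIL:
        return False, "upload refused: " + note
    return True, "warning: " + note


# Airplane case: the only reachable server is your own.
print(decide_upload(1, 4, FewServersPolicy.WARN))
```

The sketch only covers the decision at upload time; rebalancing later (options 4 and 5) would be a separate operation.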

Change History (9)

comment:1 Changed at 2007-12-17T23:49:12Z by zooko

See also ticket #232 -- "peer selection doesn't rebalance shares on overwrite of mutable file".

comment:2 Changed at 2007-12-18T02:03:22Z by warner

  • Description modified (diff)

comment:3 Changed at 2007-12-18T02:08:21Z by warner

I think that silent rebalancing is going to be an important user-friendly feature. Part of the repairer's job will be to make sure the shares are distributed across a healthy set of peers, since that falls under the title of "improving the health of the file".
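
As a rough illustration of what "distributed across a healthy set of peers" could mean in practice, the sketch below flags a placement as needing rebalancing when its shares sit on fewer distinct servers than are currently reachable. The function name, the dict-based share map, and the health rule itself are assumptions made for this example; it ignores the position of servers on the unit circle and is not how the repairer is actually specified.

```python
from collections import Counter


def placement_health(placement, reachable_servers):
    """Summarise how well spread a file's shares are.

    placement: dict mapping share number -> id of the server holding it.
    reachable_servers: ids of all servers we can currently talk to,
    including any that already hold shares.
    """
    shares_per_server = Counter(placement.values())
    distinct = len(shares_per_server)
    # The best we could do is one share per server, limited by how
    # many shares and how many reachable servers there are.
    best_possible = min(len(placement), len(set(reachable_servers)))
    return {
        "distinct_servers": distinct,
        "doubled_up": sorted(
            s for s, count in shares_per_server.items() if count > 1),
        "needs_rebalance": distinct < best_possible,
    }


# All ten shares were uploaded to "me" while on the airplane.
on_plane = {sharenum: "me" for sharenum in range(10)}
print(placement_health(on_plane, ["me"]))
print(placement_health(on_plane, ["me", "a", "b"]))
```

While only "me" is reachable the placement is as good as it can be; once more servers appear, the same placement is reported as needing rebalancing.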

Providing an interface that lets the user see where their file got put is good and useful, but I don't want users to be obligated to use or pay attention to it: the abstraction of "the grid is a big place where my files go" is a valuable one, and forcing abstraction-boundary breaks adds to the user's cognitive load.

Perhaps a flag next to the upload button would help, one that says "Warning: we're only connected to N peers right now, so you won't get the reliability you might expect: please consider waiting until you have more peers available".

I think the general principle here is that we've (well, at least *I*'ve) been designing tahoe with a static set of peers in mind: the membership of the grid changes slowly over time. Uploading a file while you're on an airplane and then connecting to a larger grid violates this expectation.

comment:4 Changed at 2007-12-18T04:58:57Z by zooko

As you may know, I question the value of the unbroken abstraction of "the grid is a big place where files go". I question it specifically because the cost of making it an unbroken abstraction seems high and potentially very high. On the other hand, it seems quite useful as a partial abstraction. "The grid is a big place where files go, except when it isn't for one of the following reasons..."

We don't have to agree right now on how valuable this abstraction is -- let's just agree to keep an open mind about these issues. Certainly for the two use cases that we have in mind -- the managed proprietary grid operated by sysadmins, and the friendnet -- the user (who is the sysadmin in the former case, I think) is expected to understand and monitor the state of the set of peers during normal usage.

If you mean that you aren't supposed to upload a file while you are on an airplane, and then later connect to a larger grid, because you understand that the set of servers you will be uploading to when you are on the airplane is too small, then I agree.

If you mean that people shouldn't use tahoe on machines that travel on airplanes, I'm not sure what I think about that. Certainly such portable machines should fit into the friendnet case, right? Also in the managed proprietary grid case, I should think that our semantics ought to specify some safe/useful/communicative behavior in the case that there are few servers.

comment:5 Changed at 2008-06-01T20:39:28Z by warner

  • Milestone changed from eventually to undecided

comment:6 Changed at 2009-04-06T06:53:08Z by warner

  • Component changed from code-network to code-peerselection

comment:7 Changed at 2010-01-07T00:48:42Z by davidsarah

  • Keywords usability availability preservation repair added

comment:8 Changed at 2010-02-23T00:11:12Z by USSJoin

  • Cc ussjoin@… added
  • Version changed from 0.7.0 to 1.6.0

So after some discussions today on my long-term use-case, it would seem that the same functionality set could solve this, #398, and #467. Let me explain:

I control a small number of nodes-- let's say four. I want to be able to tell my uploads that they should always leave four shares on the four nodes *I* own, and send the remaining six to the grid. That way, if I'm offline with *only* my four nodes for company, I can still use my files; similarly, when I go offline, people with access can *also* use my files.
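
A minimal sketch of that placement rule, assuming ten total shares (which "the remaining six" implies) and a hypothetical `split_shares` helper: one share goes to each node I own, and the rest are spread round-robin over whatever grid nodes are available. The node names are placeholders and nothing here is an existing Tahoe API.

```python
def split_shares(total_shares, own_nodes, grid_nodes):
    """Map share number -> node: own nodes first, then the grid.

    Each own node gets exactly one share; the remaining shares are
    dealt round-robin to grid_nodes.
    """
    if len(own_nodes) > total_shares:
        raise ValueError("more own nodes than shares to place")
    placement = {}
    for sharenum, node in enumerate(own_nodes):
        placement[sharenum] = node
    remaining = range(len(own_nodes), total_shares)
    if len(remaining) > 0 and not grid_nodes:
        raise ValueError("no grid nodes for the remaining shares")
    for i, sharenum in enumerate(remaining):
        placement[sharenum] = grid_nodes[i % len(grid_nodes)]
    return placement


# Four nodes of my own, ten shares in total, three grid nodes reachable.
mine = ["mine-1", "mine-2", "mine-3", "mine-4"]
grid = ["grid-a", "grid-b", "grid-c"]
for sharenum, node in sorted(split_shares(10, mine, grid).items()):
    print(sharenum, node)
```

Whether the four local shares are enough to read the file offline depends on the number of shares needed to reconstruct it, which this sketch does not model.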

In this case, I might also want to be able to configure the use of helpers etc. on a per-subnet basis; that is, "use the helper unless the node you're pushing to is on my LAN, in which case, it's silly."

Ideally I could also set up a modified rebalancer that says "make four shares and put them on my local grid subset," but that's secondary.

comment:9 Changed at 2010-12-29T14:44:56Z by zooko

  • Resolution set to duplicate
  • Status changed from new to closed

There's some good discussion in this ticket, but I think all of the changes we might make are covered by #778, #398, and #467. I'm closing this one as a duplicate and putting a reference to this one into #398 and #467.
