[tahoe-dev] [tahoe-lafs] #302: stop permuting peerlist, use SI as offset into ring instead?
David-Sarah Hopwood
david-sarah at jacaranda.org
Sat Dec 26 20:39:15 PST 2009
Jody Harris wrote:
> As a user/administrator of an ad-hoc Tahoe-lafs, there are several
> assumptions that are not valid for our use case:
> - All nodes do not start out with the same size storage space
The simulations looked at that case for simplicity, but Tahoe-LAFS doesn't
assume it. Especially when #778 is fixed (scheduled for the next release,
1.6), a non-uniform distribution of storage space should be tolerated
fairly well, *provided* that the range of capacities is not too large.
However, "4 GB to 2 TB" is probably asking too much: the servers with
4 GB will inevitably receive a lot of requests to store shares that they
don't have room for, and to download shares that they haven't stored.
I've submitted ticket #872 about that.
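To make that concrete, here is a toy simulation (not Tahoe's actual peer-selection code: the server names, capacities, share count, and the SHA-256-based permutation are all invented for the example) of a ring with a few tiny servers alongside much larger ones:

```python
import hashlib
import random

def permuted_peers(storage_index, peers):
    # Tahoe-style idea: order peers by hash(peerid + storage index),
    # giving each file its own permutation of the server list.
    # (Simplified sketch; the real algorithm differs in detail.)
    return sorted(peers, key=lambda p: hashlib.sha256(p + storage_index).digest())

# Hypothetical grid: 8 tiny servers (4 share slots each) and 4 big ones (2000).
capacities = {b"srv%d" % i: (4 if i < 8 else 2000) for i in range(12)}
free = dict(capacities)
rejected = {p: 0 for p in capacities}

random.seed(0)
SHARES_PER_FILE = 3
for _ in range(1000):
    si = random.getrandbits(128).to_bytes(16, "big")
    placed = 0
    for p in permuted_peers(si, list(capacities)):
        if placed == SHARES_PER_FILE:
            break
        if free[p] > 0:
            free[p] -= 1
            placed += 1
        else:
            rejected[p] += 1  # asked to store a share it has no room for

small_rejects = sum(r for p, r in rejected.items() if capacities[p] == 4)
large_rejects = sum(r for p, r in rejected.items() if capacities[p] == 2000)
print("rejections at small servers:", small_rejects)
print("rejections at large servers:", large_rejects)
```

The tiny servers fill up almost immediately and then keep appearing early in other files' permutations, so they absorb a steady stream of store requests they must refuse, while the large servers never reject anything.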
> - Almost all nodes will be used to store data other than Tahoe shares
This has a similar effect to starting with non-uniform storage sizes.
Note that each node has a reserved_space setting (see
<http://allmydata.org/trac/tahoe/browser/docs/configuration.txt#L282>),
and will stop accepting shares when less than that much space is left.
As of release 1.6 this setting will also work for nodes running Windows
(ticket #637). However, this reservation isn't currently applied to the
download cache if the node is configured as a gateway (ticket #871).
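For example, a node operator might reserve space in tahoe.cfg along these lines (illustrative values; see configuration.txt for the exact option syntax):

```ini
[storage]
enabled = true
# Stop accepting new shares once free disk space drops below this.
reserved_space = 10G
```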
> - Bandwidth balancing will be more important than storage space balancing
Storage balancing and bandwidth balancing are closely related: if a
server holds some fraction f of the shares, then it will be expected
to receive roughly a fraction f of the share requests, on average,
provided that the distribution of frequencies with which shares are
accessed is not too heavily skewed.
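As a rough check of that claim (toy numbers; the server names and share counts are invented), a quick simulation of uniformly random share accesses shows the request fraction tracking the storage fraction:

```python
import random

random.seed(1)
# Hypothetical grid: each server holds some fraction f of the shares.
shares_held = {"A": 700, "B": 200, "C": 100}  # f = 0.7, 0.2, 0.1
total = sum(shares_held.values())

# Each access picks a share uniformly at random; the request goes to
# whichever server holds that share.
owners = [s for s, n in shares_held.items() for _ in range(n)]
requests = {s: 0 for s in shares_held}
M = 100_000
for _ in range(M):
    requests[random.choice(owners)] += 1

for s in shares_held:
    print(s, "storage fraction", shares_held[s] / total,
          "request fraction", round(requests[s] / M, 3))
```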
Note that if a server has more storage relative to other servers, then
it is probably a more recent machine and so might be able to handle a
higher bandwidth. In the longer term it might be possible to measure the
bandwidth and use that as an input to share rebalancing (i.e. if a server
is bandwidth-constrained, move its most frequently accessed shares to
servers with more available bandwidth), but that's complicated, and
dependent on ticket #543. Again, the current algorithm will only do a
reasonable job of bandwidth balancing under the assumption that the
range of capacities is not too large.
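In case it helps make the idea concrete, here is a toy sketch of one greedy step such a rebalancing manager might take. This is entirely hypothetical: `rebalance_once` and its inputs are invented for illustration and are not a Tahoe API, and a real implementation would also have to actually transfer the share, respect placement constraints, etc.

```python
def rebalance_once(load, capacity, share_traffic):
    """One greedy rebalancing step: move the hottest share off the most
    overloaded server to the server with the most spare bandwidth.

    load:          server -> current bandwidth use
    capacity:      server -> bandwidth capacity
    share_traffic: server -> {share_id: traffic contributed by that share}
    Returns (share_id, source, dest), or None if nothing is over budget.
    """
    overloaded = max(load, key=lambda s: load[s] / capacity[s])
    if load[overloaded] <= capacity[overloaded]:
        return None  # no server is over its bandwidth budget
    dest = max((s for s in load if s != overloaded),
               key=lambda s: capacity[s] - load[s])
    hottest = max(share_traffic[overloaded], key=share_traffic[overloaded].get)
    t = share_traffic[overloaded].pop(hottest)
    share_traffic.setdefault(dest, {})[hottest] = t
    load[overloaded] -= t
    load[dest] += t
    return hottest, overloaded, dest
```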
> - Nodes will be expected to join, leave and fail at random
There are currently some significant limitations here; addressing them will
require fixing tickets #521 and/or #287.
----
http://allmydata.org/trac/tahoe/ticket/287
'download/upload: tolerate lost or missing servers'
http://allmydata.org/trac/tahoe/ticket/521
'disconnect unresponsive servers (using foolscap's disconnectTimeout)'
http://allmydata.org/trac/tahoe/ticket/543
'rebalancing manager'
http://allmydata.org/trac/tahoe/ticket/637
'support "keep this much disk space free" on Windows as well as other
platforms'
http://allmydata.org/trac/tahoe/ticket/778
'"shares of happiness" is the wrong measure; "servers of happiness" is
better'
http://allmydata.org/trac/tahoe/ticket/871
'handle out-of-disk-space condition'
http://allmydata.org/trac/tahoe/ticket/872
'Adjust the probability of selecting a node according to its storage
capacity'
--
David-Sarah Hopwood ⚥ http://davidsarah.livejournal.com