[tahoe-dev] [tahoe-lafs] #302: stop permuting peerlist, use SI as offset into ring instead?
David-Sarah Hopwood
david-sarah at jacaranda.org
Sat Dec 26 20:39:15 PST 2009
Jody Harris wrote:
> As a user/administrator of an ad-hoc Tahoe-lafs, there are several
> assumptions that are not valid for our use case:
> - All nodes do not start out with the same size storage space
The simulations looked at that case for simplicity, but Tahoe-LAFS doesn't
assume it. Especially when #778 is fixed (scheduled for the next release,
1.6), a non-uniform distribution of storage space should be tolerated
fairly well, *provided* that the range of capacities is not too large.
However, "4 GB to 2 TB" is probably asking too much: the servers with
4 GB will inevitably receive a lot of requests to store shares that they
don't have room for, and to download shares that they haven't stored.
I've submitted ticket #872 about that.
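To make that concrete, here is a toy simulation (not Tahoe's actual peer-selection code: the server names, capacities, share count, and the SHA-256-based permutation are all invented for the example) of a ring with a few tiny servers alongside much larger ones:

```python
import hashlib
import random

def permuted_peers(storage_index, peers):
    # Tahoe-style idea: order peers by hash(peerid + storage index),
    # giving each file its own permutation of the server list.
    # (Simplified sketch; the real algorithm differs in detail.)
    return sorted(peers, key=lambda p: hashlib.sha256(p + storage_index).digest())

# Hypothetical grid: 8 tiny servers (4 share slots each) and 4 big ones (2000).
capacities = {b"srv%d" % i: (4 if i < 8 else 2000) for i in range(12)}
free = dict(capacities)
rejected = {p: 0 for p in capacities}

random.seed(0)
SHARES_PER_FILE = 3
for _ in range(1000):
    si = random.getrandbits(128).to_bytes(16, "big")
    placed = 0
    for p in permuted_peers(si, list(capacities)):
        if placed == SHARES_PER_FILE:
            break
        if free[p] > 0:
            free[p] -= 1
            placed += 1
        else:
            rejected[p] += 1  # asked to store a share it has no room for

small_rejects = sum(r for p, r in rejected.items() if capacities[p] == 4)
large_rejects = sum(r for p, r in rejected.items() if capacities[p] == 2000)
print("rejections at small servers:", small_rejects)
print("rejections at large servers:", large_rejects)
```

The tiny servers fill up almost immediately and then keep appearing early in other files' permutations, so they absorb a steady stream of store requests they must refuse, while the large servers never reject anything.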
> - Almost all nodes will be used to store data other than Tahoe shares
This has a similar effect to starting with non-uniform storage sizes.
Note that each node has a reserved_space setting (see
<http://allmydata.org/trac/tahoe/browser/docs/configuration.txt#L282>),
and will stop accepting shares when less than that much space is left.
As of release 1.6 this setting will also work for nodes running Windows
(ticket #637). However, this reservation isn't currently applied to the
download cache if the node is configured as a gateway (ticket #871).
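For example, a node operator might reserve space in tahoe.cfg along these lines (illustrative values; see configuration.txt for the exact option syntax):

```ini
[storage]
enabled = true
# Stop accepting new shares once free disk space drops below this.
reserved_space = 10G
```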
> - Bandwidth balancing will be more important than storage space balancing
Storage balancing and bandwidth balancing are closely related: if a
server holds some fraction f of the shares, then it will be expected
to receive roughly a fraction f of the share requests, on average,
provided that the distribution of frequencies with which shares are
accessed is not too heavily skewed.
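As a rough check of that claim (toy numbers; the server names and share counts are invented), a quick simulation of uniformly random share accesses shows the request fraction tracking the storage fraction:

```python
import random

random.seed(1)
# Hypothetical grid: each server holds some fraction f of the shares.
shares_held = {"A": 700, "B": 200, "C": 100}  # f = 0.7, 0.2, 0.1
total = sum(shares_held.values())

# Each access picks a share uniformly at random; the request goes to
# whichever server holds that share.
owners = [s for s, n in shares_held.items() for _ in range(n)]
requests = {s: 0 for s in shares_held}
M = 100_000
for _ in range(M):
    requests[random.choice(owners)] += 1

for s in shares_held:
    print(s, "storage fraction", shares_held[s] / total,
          "request fraction", round(requests[s] / M, 3))
```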
Note that if a server has more storage relative to other servers, then
it is probably a more recent machine and so might be able to handle a
higher bandwidth. In the longer term it might be possible to measure the
bandwidth and use that as an input to share rebalancing (i.e. if a server
is bandwidth-constrained, move its most frequently accessed shares to
servers with more available bandwidth), but that's complicated, and
dependent on ticket #543. Again, the current algorithm will only do a
reasonable job of bandwidth balancing under the assumption that the
range of capacities is not too large.
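In case it helps make the idea concrete, here is a toy sketch of one greedy step such a rebalancing manager might take. This is entirely hypothetical: `rebalance_once` and its inputs are invented for illustration and are not a Tahoe API, and a real implementation would also have to actually transfer the share, respect placement constraints, etc.

```python
def rebalance_once(load, capacity, share_traffic):
    """One greedy rebalancing step: move the hottest share off the most
    overloaded server to the server with the most spare bandwidth.

    load:          server -> current bandwidth use
    capacity:      server -> bandwidth capacity
    share_traffic: server -> {share_id: traffic contributed by that share}
    Returns (share_id, source, dest), or None if nothing is over budget.
    """
    overloaded = max(load, key=lambda s: load[s] / capacity[s])
    if load[overloaded] <= capacity[overloaded]:
        return None  # no server is over its bandwidth budget
    dest = max((s for s in load if s != overloaded),
               key=lambda s: capacity[s] - load[s])
    hottest = max(share_traffic[overloaded], key=share_traffic[overloaded].get)
    t = share_traffic[overloaded].pop(hottest)
    share_traffic.setdefault(dest, {})[hottest] = t
    load[overloaded] -= t
    load[dest] += t
    return hottest, overloaded, dest
```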
> - Nodes will be expected to join, leave and fail at random
There are currently some significant limitations here; addressing them will
require fixing tickets #521 and/or #287.
----
http://allmydata.org/trac/tahoe/ticket/287
'download/upload: tolerate lost or missing servers'
http://allmydata.org/trac/tahoe/ticket/521
'disconnect unresponsive servers (using foolscap's disconnectTimeout)'
http://allmydata.org/trac/tahoe/ticket/543
'rebalancing manager'
http://allmydata.org/trac/tahoe/ticket/637
'support "keep this much disk space free" on Windows as well as other
platforms'
http://allmydata.org/trac/tahoe/ticket/778
'"shares of happiness" is the wrong measure; "servers of happiness" is
better'
http://allmydata.org/trac/tahoe/ticket/871
'handle out-of-disk-space condition'
http://allmydata.org/trac/tahoe/ticket/872
'Adjust the probability of selecting a node according to its storage
capacity'
--
David-Sarah Hopwood ⚥ http://davidsarah.livejournal.com