[tahoe-dev] Proximity Aware Decoding

Nathan Eisenberg nathan at atlasnetworks.us
Tue Dec 8 01:00:23 PST 2009

Tahoe Devs,

Is there currently any mechanism, or any plans to implement a mechanism, which allows for storage nodes to be 'arranged' with nearby gateway/WAPI nodes, allowing for a more logical scaleout in a formal datacenter environment?  To borrow from Hadoop terminology, this would be called 'Rack Awareness' (http://hadoop.apache.org/common/docs/r0.17.2/hdfs_user_guide.html#Rack+Awareness).

For example, say that I have three facilities, and I wish to set up 3-4 nodes in each of them in a 3-of-10 scheme:

Facility 1 (ISDN):
Gateway 1
Node 1A
Node 1B
Node 1C

Facility 2 (ISDN):
Gateway 2
Node 2A
Node 2B
Node 2C
Node 2D

Facility 3 (56K):
Gateway 3
Node 3A
Node 3B
Node 3C

For argument's sake, let us say that we have a very expensive, limited connection between these facilities (to make this extreme, let's call it a dialup-ish connection; obviously this is an exaggeration, but the argument scales up).

If gateway 1 attempts to retrieve a file, it is obviously most efficient for it to do so using nodes 1A, 1B, and 1C.  In the event that 1C is down, a share held by any other node can step in, at a greater cost to the infrastructure.  Ideally, you would also be able to weight the next-best share: if facilities 1 and 2 have ISDN lines and facility 3 has a dialup line, it is preferable for a gateway at facility 1 to query a node at facility 2.  If no weighting is configured, or in a distributed friendnet, other possible methods could be distance-vector routing (how many hops to the other nodes with shares), latency, or Geo-IP lookups.
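To make the selection behaviour concrete, here is a rough Python sketch of what I have in mind.  It is not Tahoe code and does not use any Tahoe API: the names (Node, UPLINK_COST, link_cost, pick_nodes) and the cost numbers are all made up for illustration, and I am assuming one share per node and that the cost between two facilities is set by the slower of their two uplinks.

```python
from dataclasses import dataclass

# Hypothetical per-facility uplink costs (lower is better).
# Facilities 1 and 2 are on ISDN, facility 3 is on 56K dialup.
UPLINK_COST = {"F1": 10, "F2": 10, "F3": 50}


@dataclass(frozen=True)
class Node:
    name: str
    facility: str
    up: bool = True


def link_cost(local: str, remote: str) -> int:
    """Cost of fetching one share from `remote` as seen from `local`.

    Assumption: traffic inside a facility is free, and traffic between
    facilities is limited by the slower (more expensive) uplink.
    """
    if local == remote:
        return 0
    return max(UPLINK_COST[local], UPLINK_COST[remote])


def pick_nodes(gateway_facility: str, share_holders: list, k: int) -> list:
    """Return the k cheapest reachable nodes to download shares from."""
    candidates = [n for n in share_holders if n.up]
    if len(candidates) < k:
        raise ValueError("not enough reachable shares to decode")
    # Sort by cost first; the name is only a deterministic tie-breaker.
    candidates.sort(key=lambda n: (link_cost(gateway_facility, n.facility), n.name))
    return candidates[:k]


# The layout from this message, with node 1C down.
nodes = [
    Node("1A", "F1"), Node("1B", "F1"), Node("1C", "F1", up=False),
    Node("2A", "F2"), Node("2B", "F2"), Node("2C", "F2"), Node("2D", "F2"),
    Node("3A", "F3"), Node("3B", "F3"), Node("3C", "F3"),
]

# Gateway 1 needs 3 shares (3-of-10): two local, then one from facility 2.
chosen = [n.name for n in pick_nodes("F1", nodes, 3)]
print(chosen)
```

With 1C down, gateway 1 takes 1A and 1B locally and falls back to a facility 2 node rather than one behind the dialup line.  The static cost table could just as well be replaced by measured latency or hop count; only link_cost would change.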

Best Regards,
Nathan Eisenberg