[tahoe-dev] using the volunteer grid (dogfood tasting report, continued)

Wed Apr 8 16:02:09 PDT 2009

On Wed, Apr 08, 2009 at 04:04:34PM -0600, zooko wrote:
> So now that I have connected to the volunteer grid and uploaded  
> some .flac files, I decided to try playing them by streaming them  
> directly from the grid.
In my opinion this is one of Tahoe's shortcomings: it does not yet do any
caching. Maybe this can be added in a future release.

> 3       hfhjhv7sop73qzjisoq5ksmhsf4fj47w (trid0)
> 
> Okay, so the combination of francois, ndurner, secorp, and trid put  
> together gave me 74 KB/sec.  Perhaps one of those four can upload at  
> most 18.5 KB/sec (one fourth of 74 KB/sec), and that one's upstream  
> bandwidth is the limiting factor in this.
I have to admit that my node has an upstream bandwidth of only ~60 KB/s in
total, which is shared with other applications. Furthermore, I will have to
start shaping traffic during the day, as my brother likes to play World of
Warcraft and similar games without lag.

> Then I thought: Hm, wait a second, I have 13-of-16 encoding here,  
> which means the file can survive the loss of any 3 shares.  But,  
> several of those servers are holding more than one share.  The loss  
> of both trid0 and SECORP_DOT_NET_01, for example, would take away 4  
> shares and leave only 12 -- not enough to recover my file.  (This is  
> quite apart from the issue that I mentioned in my previous dogfood  
> tasting report -- that I would like to store an equal amount of  
> shares with each *operator* rather than with each *machine*.)
I think this concept should be taken even further. Imagine a configuration
directive called "ReliabilityGroup" that holds an array of strings. The more
elements a node's array has in common with the ReliabilityGroup array of
another node, the more likely the two nodes are to fail together. Example:

Node A
    ReliabilityGroup ["Europe", "Germany", "Berlin", "Jon Doe", "server1"]
Node B
    ReliabilityGroup ["Europe", "Germany", "Munich", "Some Person", "server1"]
Node C
    ReliabilityGroup ["America", "USA", "Texas", "somebody else", "my big storage rig"]

The common subset ["Europe", "Germany"] means that Node A and Node B are more
likely to fail at the same time than Node B and Node C. When a user wants to
place two shares, logically Node C and one of Node A and Node B should be
selected. This would even allow defining ReliabilityGroups down to the
individual hard disk used by a node. Each level could become another array
element (e.g. state -> city -> data center -> floor -> room -> rack ->
machine -> disk number).
Last but not least, this could also help when downloading files. Instead of
picking the first X nodes (where X is the number of shares required to
reassemble the file), the client could pick the X nodes that are closest to
it according to the reliability group.
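
To make the idea more concrete, here is a rough Python sketch. Nothing in it
is existing Tahoe code: the function names shared_depth, place_shares and
closest_nodes are made up for illustration, and so is the client's own group
in the download example. It also assumes that "in common" means a common
leading prefix, so that two machines which both happen to be called "server1"
in different cities do not count as related.

```python
def shared_depth(a, b):
    """Count the leading labels two ReliabilityGroup lists have in common."""
    depth = 0
    for x, y in zip(a, b):
        if x != y:
            break
        depth += 1
    return depth


def place_shares(groups, count):
    """Hypothetical upload helper: greedily choose `count` nodes, each time
    taking the node whose worst overlap with the already chosen nodes is
    smallest. Ties go to the node that sorts first by name."""
    chosen = []
    candidates = sorted(groups)
    while candidates and len(chosen) < count:
        best = min(
            candidates,
            key=lambda n: max(
                (shared_depth(groups[n], groups[c]) for c in chosen),
                default=0,
            ),
        )
        chosen.append(best)
        candidates.remove(best)
    return chosen


def closest_nodes(groups, client_group, needed):
    """Hypothetical download helper: rank nodes by how much of the client's
    own group they share and return the `needed` closest ones."""
    ranked = sorted(
        groups, key=lambda n: -shared_depth(groups[n], client_group)
    )
    return ranked[:needed]


# The three nodes from the example above.
groups = {
    "Node A": ["Europe", "Germany", "Berlin", "Jon Doe", "server1"],
    "Node B": ["Europe", "Germany", "Munich", "Some Person", "server1"],
    "Node C": ["America", "USA", "Texas", "somebody else", "my big storage rig"],
}

print(place_shares(groups, 2))  # ['Node A', 'Node C']

# Assumed group for a downloading client located in Berlin.
client = ["Europe", "Germany", "Berlin", "laptop user", "laptop"]
print(closest_nodes(groups, client, 2))  # ['Node A', 'Node B']
```

The greedy choice is only one possible policy. A real implementation would
also have to weigh free space, bandwidth and the number of shares already
held by each node.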

Cheers,
David