[tahoe-dev] Proximity Aware Decoding

Zooko O'Whielacronx zookog at gmail.com
Wed Dec 9 07:46:21 PST 2009


Oh by the way there is another cool thing you can do if you do have
"proximity awareness": save the bandwidth costs of repair.  Suppose
you have a petabyte worth of hard drives -- e.g. 500 hard drives each
of 2 TB.  Hard drives fail at a rate of around 6% per year [1], so you
would expect about 30 of them to fail per year, which is about one
every two weeks.
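
To make that arithmetic easy to re-run with other numbers, here is a
minimal sketch in Python.  The variable names are mine, picked just for
this illustration; the only inputs are the drive count, drive size and
the roughly 6% annual failure rate quoted above.

```python
# Back-of-the-envelope check of the drive-failure estimate.
# All names here are illustrative, not part of Tahoe-LAFS.
DRIVES = 500                 # number of hard drives
DRIVE_SIZE_TB = 2            # capacity of each drive, in TB
ANNUAL_FAILURE_RATE = 0.06   # roughly 6% per year, the figure from [1]

# Total raw capacity: 500 * 2 TB = 1000 TB, i.e. about a petabyte.
total_capacity_tb = DRIVES * DRIVE_SIZE_TB

# Expected number of drives lost in a year.
expected_failures_per_year = DRIVES * ANNUAL_FAILURE_RATE

# Average gap between failures, in days.
days_between_failures = 365 / expected_failures_per_year

print(total_capacity_tb, "TB total")
print(round(expected_failures_per_year), "expected failures per year")
print(round(days_between_failures, 1), "days between failures on average")
```

That comes out to a failure every 12 days or so, which is where "about
one every two weeks" comes from.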

Suppose you have three locations which have very fast and cheap
bandwidth within the location, e.g. the LAN connecting different
servers in the same rack, but slower and more expensive bandwidth
between locations, e.g. if you have to pay a per-byte fee to
transmit between co-lo's.  Then let's call "Q" the number of
locations, and you should set your N value to be (K+1) * Q.  For
example, if K is 3 (three shares needed to reconstruct a file), and Q
is 3 (three locations), then set your N equal to 12 and (once we
implement "proximity awareness") make sure that there are four shares
in each location.
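
Here is a rough sketch of that layout rule in Python, just to make the
numbers concrete.  This is not Tahoe-LAFS code: the function names and
the location names ("colo-a" and so on) are made up for illustration,
and the way shares are assigned to locations is only one simple way to
get K+1 shares in each place.

```python
# Sketch of the N = (K+1) * Q rule.  Hypothetical helper names,
# not an existing Tahoe-LAFS API.

def total_shares(k, q):
    """N for K shares needed and Q locations: (K+1) * Q."""
    return (k + 1) * q

def place_shares(k, locations):
    """Assign share numbers 0..N-1 so each location holds K+1 of them."""
    per_location = k + 1
    n = total_shares(k, len(locations))
    layout = {loc: [] for loc in locations}
    for share_num in range(n):
        # Consecutive runs of K+1 shares go to the same location.
        layout[locations[share_num // per_location]].append(share_num)
    return layout

def shares_left_after_one_loss(layout, location):
    """How many shares a location still holds after losing one of them."""
    return len(layout[location]) - 1

K = 3
LOCATIONS = ["colo-a", "colo-b", "colo-c"]   # made-up names

layout = place_shares(K, LOCATIONS)
print(total_shares(K, len(LOCATIONS)))       # N
for loc, shares in layout.items():
    print(loc, shares)
```

With K = 3 and three locations this gives N = 12 and four shares per
location.  Note that shares_left_after_one_loss() returns K for any of
the locations, so a location that loses a single share still holds the
K shares needed to reconstruct the file.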

The advantage of this is that it makes repair cheap.  Suppose you have a

[1] Eduardo Pinheiro, Wolf-Dietrich Weber, and Luiz André Barroso
(all of Google Inc.): Failure Trends in a Large Disk Drive Population
http://labs.google.com/papers/disk_failures.pdf

