Shawn,

You obviously know more about this than I do.

So, in summary, are you suggesting:

(S) # of servers: 20
(H) happiness: 15
(K) # of shares needed: 7

On Sat, Jan 15, 2011 at 5:10 PM, Shawn Willden <shawn@willden.org> wrote:

It may seem that the reason high node uptime matters is so that files can be retrieved reliably, i.e. read-availability. In fact, the bigger hurdle is maintaining write-availability. This is fairly obvious once you note that a read needs only K servers while a write needs H servers, and H is usually significantly larger than K.

I think it's even more important than it appears, however, because there's value in setting H very close to S (the number of servers in the grid). If S=20 and H=18, it's crucial that individual server availability be very high; otherwise the probability of more than two servers being down at once is significant, and the grid is then unavailable for writes.

So, why would you want to set H very high, rather than just sticking with the default 3/7/10 parameters?

There are two reasons you might want to increase H. The first is to increase read-reliability; the second is so that you can increase K and reduce expansion while maintaining a given level of read-reliability. For purposes of determining the likelihood that a file will be available at some point in the future, I ignore N. Setting H and N to different values is basically saying "I'll accept one level of reliability, but if I happen to get lucky I'll get a higher one". That's fine, but when choosing parameters, it's H and K that make the difference. In fact, if S happens to have declined so that at the moment of your upload S=H, then any value of N > H is wasted.

If you want to find out what kind of reliability you can expect from different parameters, there's a tool in the Tahoe source tree. Unfortunately, I haven't done the work to make it available from the web UI, but you can use it like this:

1. Go to the tahoe/src directory.
2. Run python with no command-line arguments to start the Python interpreter.
3. Type "import allmydata.util.statistics as s" to import the statistics module and give it a handy label (s).
4. Type "s.pr_file_loss([p]*H, K)", where "p" is the per-server reliability, and H and K are the values you want to evaluate. (There's an example session below.)

What value should you use for p? Ideally it's the probability that the data on a server will _not_ be lost before your next repair cycle. To be conservative, I just use the server _availability_ target, which I'm proposing should be 0.95.
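
To make that concrete, here's roughly what a session might look like, using the 0.95 reliability figure and the 15/7 parameters discussed in this thread (I've left the output out; the exact number depends on your Tahoe version):

    $ cd tahoe/src
    $ python
    >>> import allmydata.util.statistics as s
    >>> p = 0.95    # per-server reliability (here, the availability target)
    >>> H = 15      # servers the file is written to
    >>> K = 7       # servers needed to read it back
    >>> s.pr_file_loss([p]*H, K)   # estimated probability the file is lost before the next repair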

The value you get is an estimate of the likelihood that your file will be lost before the next repair cycle. If you want to understand how it's calculated, and maybe argue with me about its validity, read my lossmodel paper (in the docs dir). I think it's a very useful figure.

However, unless you're only storing one file, that's only part of the story. Suppose you're going to store 10,000 files. On a sufficiently large grid (which volunteergrid2 will not be), you can model the survival or failure of each file independently, which means the probability that all of your files survive is "(1 - s.pr_file_loss([p]*H, K))**10000". Since volunteergrid2 will not be big enough for the independent-survival model to be accurate, the real estimate falls somewhere between that figure and "1 - s.pr_file_loss([p]*H, K)", which is the single-file survival probability. To be conservative, I pay attention to the lower of the two, which is the 10,000-file number.
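
Spelled out, the two bounds for the 15/7 example with p=0.95 and 10,000 files would be computed like this (outputs again omitted):

    >>> import allmydata.util.statistics as s
    >>> p, H, K, nfiles = 0.95, 15, 7, 10000
    >>> per_file_loss = s.pr_file_loss([p]*H, K)
    >>> 1 - per_file_loss              # optimistic bound: single-file survival
    >>> (1 - per_file_loss) ** nfiles  # conservative bound: all 10,000 files survive (assumes independence)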

Anyway, if you use that tool and spend some time playing with different values of H and K, what you find is that if you increase H, you can also increase K and reduce your expansion factor while maintaining your survival probability. If you think about it, this makes intuitive sense: although you're decreasing the amount of redundancy, you're actually increasing the number of servers that must fail in order for your data to be lost. With 3/7, if five servers fail, your data is gone. With 7/15, nine servers must fail. With 35/50, 16 must fail. Of course that's five out of seven, nine out of 15, and 16 out of 50, but with reasonably high server availability the probabilities of those failure counts are very nearly the same.
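
If you want to see that for yourself, a quick loop over the K/H pairs above (still using p=0.95) looks something like this:

    >>> import allmydata.util.statistics as s
    >>> p = 0.95
    >>> for K, H in [(3, 7), (7, 15), (35, 50)]:
    ...     print K, H, s.pr_file_loss([p]*H, K)   # loss probability for each parameter pair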

From a read-performance perspective there's also some value in increasing K, because it allows more parallelism in downloads -- at least in theory. With the present Tahoe codebase that doesn't help as much as it should, but it will be fixed eventually. (At present you do download in parallel from K servers, but all K downloads are limited to the speed of the slowest, so your effective bandwidth is K*min(server_speeds). If that were fixed, it would instead be the sum of the bandwidth available from the K servers.)
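
To put some made-up numbers on that: with K=7, if the seven servers you're downloading from can deliver 10, 8, 8, 5, 5, 3 and 1 Mbps, today's behavior gives you roughly 7 * 1 = 7 Mbps (seven streams, each throttled to the slowest), whereas summing the individual speeds would give you 40 Mbps.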

So, if we can take it as a given that larger values of K and H are a good thing (and I'm happy to go into more detail about why if anyone likes; I've glossed over a lot here), then the ideal way to choose your parameters is to set H=S and then pick the largest K that gives you the level of reliability you're looking for.

But if you set H=S, then even a single server being unavailable means that the grid is unavailable for writes. So you want to set H a little smaller than S. How much smaller? That depends on what level of server availability you have, and what level of write-availability you require.

I'd like to have 99% write-availability. With 95% individual server availability and a grid of 20 servers, the probability that at least a given number of servers is up at any given moment is:

20 servers: 35.8%
19 servers: 73.6%
18 servers: 92.5%
17 servers: 98.4%
16 servers: 99.7%
15 servers: 99.9%

Again, if anyone would like to understand how I calculated those, just ask -- there's also a short calculation sketch below.
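
In case anyone wants to check the arithmetic, here's one way to reproduce those figures -- it's just the standard binomial tail, nothing Tahoe-specific (the figures above are these probabilities expressed as percentages):

    >>> from math import factorial
    >>> def choose(n, k):
    ...     return factorial(n) / (factorial(k) * factorial(n - k))
    ...
    >>> def pr_at_least(m, n=20, p=0.95):
    ...     # probability that at least m of n servers are up, if each is up with probability p
    ...     return sum(choose(n, i) * p**i * (1 - p)**(n - i) for i in range(m, n + 1))
    ...
    >>> for m in range(20, 14, -1):
    ...     print m, pr_at_least(m)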

At 99.9% availability, if I can't write to the grid, it's more likely because my own network connection is down than because there aren't enough servers up to satisfy H=15.

So, that's why I'd really like everyone to commit to maintaining 95+% availability on their individual servers. In practice, if something takes your box down for a few days it's not a huge deal, because more than likely most of the other nodes will be above 95% availability; what we don't want is a situation (like we have over on volunteergrid1) where a server is unavailable for weeks.

If you can't commit to keeping your node available nearly all the time, I would rather you not be in the grid. Sorry if that seems harsh, but I really want this to be a production grid that we can use with very high confidence that it will always work, for both reads and writes.

Also, sorry for the length of this e-mail :-)

--
Shawn