[volunteergrid2-l] Recommended settings
Shawn Willden
shawn at willden.org
Wed Jun 29 16:12:45 PDT 2011
On Wed, Jun 29, 2011 at 1:36 PM, Billy Earney <billy.earney at gmail.com> wrote:
> But if we set in the config file something like N=90%, which would mean to
> set N=T*90% (where T is the total number of nodes available at time of file
> upload)
>
To ask my next question easily, I need some more notation:
T_now = total number of servers available at time of upload
T_max = total number of servers in the grid, including those currently
unavailable.
What would be the point in setting N < T_now? The reason for possibly
wanting to set N < T_max is because it's possible that T_now < T_max which
would cause uploads to fail if N > T_now, but I don't see why you would want
to get less dispersion than is currently available.
> , then H could equal some minimal number of nodes necessary for an upload,
> calculated from R (R for reliability %).
>
> If we assume that node availability is 95%, then what would H have to be to
> have a reliability of R? There’s probably a formula for this in the tahoe
> api.
>
I don't believe a closed-form formula for calculating that is possible --
at least, I don't know how to do it and I've done more work on this math
than anyone else, AFAIK. However, there is an efficient way to calculate R
given shares-distributed and shares-required (K). So it's not difficult to
do a quick search of the space of the parameter you want to optimize.
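A minimal sketch of that kind of calculation, under a deliberately simple model: every share lands on a different server, and each server is independently available with the same probability p (95% in the example above). The function name and the model are illustrative assumptions, not Tahoe's own code; R is then the binomial tail probability that at least K of the distributed shares are reachable.

```python
from math import comb


def reliability(n: int, k: int, p: float) -> float:
    """Probability that at least k of n distributed shares are reachable.

    Assumes one share per server and independent servers that are each
    available with probability p (a simplification for illustration).
    """
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    # Sum the binomial probabilities of exactly i servers being up, i = k..n.
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Ten shares distributed, any three required, 95% server availability.
    print(reliability(10, 3, 0.95))
```

Real grids have servers with differing availability, so treat this as a starting point rather than an accurate model.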
Actually, though, in the case where you're optimizing dynamically, I think
the need for H disappears entirely. If you know how many servers are
available and accepting shares _now_, then that's the number of shares you
want to distribute. The parameter you would calculate, then, is K. You
would do a quick search of the range of possible K values, looking for the
largest K that gives you a reliability that is better than your required
reliability threshold R.
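The search described above can be sketched as follows, again assuming one share per server and independent servers with a common availability p. The names `survival` and `largest_k` are hypothetical, and the sketch accepts a K whose reliability equals the threshold as well as one that exceeds it.

```python
from math import comb
from typing import Optional


def survival(n: int, k: int, p: float) -> float:
    """P(at least k of n independent servers are up), each up with prob. p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def largest_k(n: int, p: float, r_required: float) -> Optional[int]:
    """Largest K in 1..n whose reliability meets r_required, else None.

    n is the number of servers accepting shares now, so it is also the
    number of shares distributed. Reliability only falls as K grows, so
    scanning downward from n finds the largest acceptable K first.
    """
    for k in range(n, 0, -1):
        if survival(n, k, p) >= r_required:
            return k
    return None


if __name__ == "__main__":
    print(largest_k(10, 0.95, 0.99))     # 7
    print(largest_k(10, 0.95, 0.99999))  # 5
```

With ten servers at 95% availability, moving the threshold from 99% to 99.999% lowers the usable K from 7 to 5, which is the storage cost of the stricter requirement.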
>
> My reliability threshold may be 99% for some files (willing to lose
> sometimes), but for other files it could be (99.999%). Which brings up
> another topic of allowing these to be entered from the command line when
> uploading files, since different files could have different R’s. Just my
> $0.02. :-)
>
Yep, it would ultimately be very nice to be able to set different Rs for
different files. And as long as we're dreaming, this could be used to
compute a required R for dirnodes, based on the required Rs for the files in
it (and, recursively, for dirnodes under it). This would ensure that
top-level dirnodes automagically become highly, highly reliable, which would
have prevented the worst of the allmydata problem.
But, talk is cheap :-)
--
Shawn