[volunteergrid2-l] Failure Analysis

Shawn Willden shawn at willden.org
Wed Feb 2 17:38:57 PST 2011


On Wed, Feb 2, 2011 at 10:31 AM, Jody Harris <jharris at harrisdev.com> wrote:

> All failure analysis should be in a single document per node, with
> "overlapping failure" links between nodes. Once we start getting a handle on
> the grid and the failures, we will be aware of having too many nodes under
> single failure points.
>

Very cool.  If you look at my Tahoe-LAFS reliability modeling paper, you'll
see I've described how to compute reliability estimates that accurately
incorporate both overlapping and non-overlapping failure modes.  At bottom
it's quite simple.  If you can identify the failure modes and assign a
probability to each (hard to do accurately, but I think we can take some
reasonable SWAGs), then you can model each individual failure with a
trivial probability mass function (PMF) and combine those PMFs in fairly
straightforward ways to build up an overall reliability estimate for your
files.
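
As a rough illustration of the idea (a minimal sketch, not the method from
the paper), here is one way to combine per-node PMFs in Python.  It assumes
each node holds exactly one share, that nodes fail independently unless they
sit behind a shared failure point, and that a shared failure takes out every
node in its group.  All function names and probabilities are made up for
the example.

```python
from functools import reduce

def bernoulli_pmf(p_survive):
    """PMF over the number of surviving shares (0 or 1) on one node."""
    return [1.0 - p_survive, p_survive]

def convolve(a, b):
    """PMF of the sum of two independent share counts."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, pa in enumerate(a):
        for j, pb in enumerate(b):
            out[i + j] += pa * pb
    return out

def combine_independent(pmfs):
    """Non-overlapping failure modes: convolve the individual PMFs."""
    return reduce(convolve, pmfs, [1.0])

def with_shared_failure(pmf, p_shared_failure):
    """Overlapping failure mode: with probability p_shared_failure the
    whole group is lost (zero shares survive); otherwise the group
    behaves according to pmf."""
    out = [(1.0 - p_shared_failure) * p for p in pmf]
    out[0] += p_shared_failure
    return out

def prob_at_least(pmf, k):
    """Probability that at least k shares survive."""
    return sum(pmf[k:])

# Hypothetical grid: two nodes behind one shared failure point (say, the
# same power feed), plus two unrelated nodes.  The numbers are guesses.
site_a = with_shared_failure(
    combine_independent([bernoulli_pmf(0.9), bernoulli_pmf(0.9)]), 0.05)
others = combine_independent([bernoulli_pmf(0.9), bernoulli_pmf(0.9)])
total = convolve(site_a, others)

# Chance that a file needing any 2 of its 4 shares is still recoverable.
print(prob_at_least(total, 2))
```

The shared failure point is applied to the group's PMF before the group is
convolved with the rest of the grid, which is what keeps the overlapping
failure from being counted as if it were independent per node.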

-- 
Shawn

