[tahoe-dev] Bayesian Approach to Black Swans
Shawn Willden
shawn-tahoe at willden.org
Sun Apr 5 18:34:42 PDT 2009
On Sunday 05 April 2009 03:32:35 pm Josh Wilcox wrote:
> I don't understand why Black Swans should be impossible to model
> effectively.
The reason you can't model them is that they're too rare to let you
determine a probability distribution. If you've been watching for 10 years
and seen a single event, what's the probability of seeing another in the next
10 years? With a single data point, you have no way of knowing.
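
To put rough numbers on that, here is a minimal sketch. It rests on
assumptions of mine rather than anything measured: events are treated as a
Poisson process, the yearly rate gets a Gamma prior, and the function name and
the three priors are hypothetical choices made only for illustration. It
computes the posterior predictive probability of at least one event in the
next 10 years, given one event in the last 10.

```python
def prob_at_least_one(events, years_observed, years_ahead,
                      prior_shape, prior_rate):
    """P(at least one event in the next `years_ahead` years).

    Assumption: events follow a Poisson process whose yearly rate has a
    Gamma(prior_shape, prior_rate) prior. The posterior is then
    Gamma(prior_shape + events, prior_rate + years_observed), and the
    predictive probability of zero events in the next window is
    (rate / (rate + years_ahead)) ** shape.
    """
    shape = prior_shape + events
    rate = prior_rate + years_observed
    p_none = (rate / (rate + years_ahead)) ** shape
    return 1.0 - p_none


# Hypothetical priors, as (shape, rate). The first two are improper
# (rate = 0) but give a proper posterior once some time has been observed.
priors = {
    "flat on the rate": (1.0, 0.0),
    "Jeffreys-like": (0.5, 0.0),
    "skeptical, prior mean 0.01/yr": (1.0, 100.0),
}

for label, (a, b) in priors.items():
    p = prob_at_least_one(events=1, years_observed=10, years_ahead=10,
                          prior_shape=a, prior_rate=b)
    print(f"{label}: {p:.2f}")
```

Under these assumptions the same single observation gives anything from
roughly 0.16 to 0.75, depending on which prior you pick. The prior, not the
data, is doing most of the work.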
In the case of a distributed file system, one sort of black swan I could
imagine encountering is a bug that causes massive, widespread corruption.
Once you fix the bug, you know that particular black swan will never recur
(assuming you write and use test cases, etc.), but you can't know if there's
another similarly serious bug lurking in the code.
Another sort of black swan I could imagine is a widespread catastrophe, but
IMO those are irrelevant. Odds are that if something like that happens,
losing some data will be the least of our concerns.
In any event, I don't worry too much about the effects of black swans on the
loss model. The primary sorts of failures that we're trying to mitigate are
hardware failures and user errors, both of which are very common.
Shawn.