[tahoe-dev] Bayesian Approach to Black Swans

Sun Apr 5 18:34:42 PDT 2009

On Sunday 05 April 2009 03:32:35 pm Josh Wilcox wrote:
>   I don't understand why Black Swans should be impossible to model
> effectively.

You can't model them because they're too rare to let you estimate a 
probability distribution.  If you've been watching for 10 years and have 
seen a single event, what's the probability of seeing another in the next 
10 years?  You have absolutely no way of knowing.
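
To put rough numbers on that question, here is a minimal sketch.  It 
assumes the events arrive as a Poisson process with a constant rate, which 
is a strong assumption and not something a single observation can confirm.  
The function names are hypothetical, chosen only for illustration.  It 
computes the exact 95% confidence interval for the expected number of 
events per 10-year window after seeing one event, then converts each end 
of the interval into the probability of at least one event in the next 
window.

```python
import math


def poisson_cdf(k, lam):
    """Return P(X <= k) for X ~ Poisson(lam)."""
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total


def bisect_decreasing(f, lo, hi, iters=200):
    """Find the root of a decreasing function f, given f(lo) > 0 > f(hi)."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_rate_interval(k, alpha=0.05):
    """Exact interval for the expected count per window, given k events seen.

    Assumes a constant-rate Poisson process (an assumption, not a given).
    """
    if k == 0:
        lower = 0.0
    else:
        # Lower bound: P(X >= k | lam) = alpha / 2
        lower = bisect_decreasing(
            lambda lam: poisson_cdf(k - 1, lam) - (1 - alpha / 2), 0.0, 100.0
        )
    # Upper bound: P(X <= k | lam) = alpha / 2
    upper = bisect_decreasing(
        lambda lam: poisson_cdf(k, lam) - alpha / 2, 0.0, 100.0
    )
    return lower, upper


def prob_at_least_one(lam):
    """P(at least one event in the next window) for expected count lam."""
    return 1.0 - math.exp(-lam)


# One event observed in one 10-year window.
low, high = exact_rate_interval(1)
print("expected events per window: %.4f to %.4f" % (low, high))
print("P(another in next window):  %.3f to %.3f"
      % (prob_at_least_one(low), prob_at_least_one(high)))
```

Under those assumptions the interval runs from roughly 2.5% to 99.6%, so 
even the most generous model says almost nothing useful about the next 10 
years.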

In the case of a distributed file system, one sort of black swan I could 
imagine encountering is a bug that causes massive, widespread corruption.  
Once you fix the bug, you know that particular black swan will never recur 
(assuming you write and use test cases, etc.), but you can't know if there's 
another similarly serious bug lurking in the code.

Another sort of black swan I could imagine is a widespread catastrophe, but 
IMO those are irrelevant.  Odds are that if something like that happens, 
losing some data will be the least of our concerns.

In any event, I don't worry too much about the effects of black swans on the 
loss model.  The primary sorts of failures that we're trying to mitigate are 
hardware failures and user errors, both of which are very common.

	Shawn.