Fwd: Erasure Coding
David Vorick
david.vorick at gmail.com
Sun Dec 1 16:25:21 UTC 2013
---------- Forwarded message ----------
From: David Vorick <david.vorick at gmail.com>
Date: Sun, Dec 1, 2013 at 11:25 AM
Subject: Re: Erasure Coding
To: Alex Elsayed <eternaleye at gmail.com>
Alex, thanks for those resources. I will check them out later this week.
I'm trying to create something that will function as a market for cloud
storage. People can rent out storage to the network for credit (a
cryptocurrency - not bitcoin, but something heavily inspired by bitcoin
and the other altcoins), and people who have credit (which can be
obtained by trading on an exchange, or by renting storage to the network)
can rent storage from the network.
So the clusters will be spread out over large distances. With RAID5 and 5
disks, the network needs to communicate 4 bits to recover each lost bit.
That's really expensive. The computational cost is not the concern; the
bandwidth cost is. (Though there are computational limits as well.)
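To make that bandwidth cost concrete, here's a minimal sketch of
single-parity (RAID5-style) XOR recovery over 4 data blocks and 1 parity
block; the block size is just an illustrative assumption:

    import os
    from functools import reduce

    BLOCK_SIZE = 4096    # bytes per block; illustrative only

    def xor(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    data = [os.urandom(BLOCK_SIZE) for _ in range(4)]   # 4 data disks
    parity = reduce(xor, data)                          # 1 parity disk

    # Lose one data block; rebuilding it means reading every survivor.
    lost = 2
    survivors = [b for i, b in enumerate(data) if i != lost] + [parity]
    recovered = reduce(xor, survivors)

    assert recovered == data[lost]
    blocks_read = len(survivors)
    print(blocks_read, "blocks transferred to rebuild 1 block")   # -> 4

So every lost block costs 4 blocks of network traffic to repair, which is
exactly the overhead I'm worried about.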
When you buy storage, all of the redundancy and erasure coding happens
behind the scenes. So a network that needs 3x redundancy will be 3x as
expensive to rent storage from. To be competitive, this number should be as
low as possible. If we had Reed-Solomon and infinite bandwidth, I think we
could safely get the redundancy below 1.2. But with all the other
requirements, I'm not sure what a reasonable minimum is.
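For reference, the redundancy factor of an MDS code like Reed-Solomon is
just n/k, where k is the number of data pieces and n the total number of
pieces, and it survives any n - k losses. The parameter choices below are
illustrative, not a proposal:

    # Redundancy of an (n, k) MDS code is n / k; survives any n - k losses.
    def redundancy(k, n):
        return n / k

    for k, n in [(1, 3),       # plain 3x replication
                 (4, 5),       # RAID5-style single parity
                 (50, 60)]:    # wide Reed-Solomon-style code
        print("k=%3d n=%3d  redundancy=%.2f  tolerates %d failures"
              % (k, n, redundancy(k, n), n - k))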
Since many people can be renting many different clusters, each machine on
the network may (and realistically will) be participating in many clusters
at once, probably in the hundreds to thousands. So handling a failure needs
to be fairly cheap. I don't think this requirement is as extreme as it may
sound, because if you are participating in 100 clusters each renting an
average of 50 GB of storage, your overall expenses should be similar to
participating in a few clusters each renting an average of 1 TB. The
important part is that you can keep up with multiple simultaneous network
failures, and that a single node is never a bottleneck in the repair
process.
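A quick back-of-the-envelope check of that claim (the cluster counts and
sizes are the ones above; "a few" = 5 is my own stand-in):

    GB = 10**9
    TB = 10**12

    many_small = 100 * 50 * GB   # 100 clusters renting ~50 GB from you each
    few_large = 5 * 1 * TB       # 5 clusters renting ~1 TB from you each
    print(many_small / TB, few_large / TB)   # 5.0 TB committed either way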
We need 100s - 1000s of machines in a single cluster for multiple reasons.
The first is that it makes the cluster roughly as stable as the network as
a whole. If you have 100 machines randomly selected from the network, and
on average 1% of the machines on the network fail per day, your cluster
shouldn't stray too far from 1% failures per day. Even more so if you have
300 or 1000 machines. But another reason is that the network is used to
mine currency based on how much storage you are contributing to the
network. If there is some way you can trick the network into thinking you
are storing data when you aren't (or you can somehow lie about the volume),
then you've broken the network. Having many nodes in every cluster is one
of the ways cheating is prevented. (There are a few others too, but they're
off-topic here.)
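The stability argument is just the law of large numbers. Here is a rough
sketch using a binomial model with the 1%/day failure rate from above; the
5% threshold, standing in for the code's parity margin, is an assumption of
mine:

    from math import comb

    def p_more_than(frac, n, p=0.01):
        """P(more than frac*n of n machines fail in a day), X ~ Binomial(n, p)."""
        t = int(frac * n)
        return sum(comb(n, k) * p**k * (1 - p)**(n - k)
                   for k in range(t + 1, n + 1))

    for n in (100, 300, 1000):
        print("n=%4d  P(daily failures > 5%%) = %.2e"
              % (n, p_more_than(0.05, n)))

The bigger the cluster, the tighter the daily failure fraction concentrates
around the network-wide 1%.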
Cluster size should be dynamic (fountain codes?) to support a cluster that
grows and shrinks with demand. Imagine if some of the files become public
(for example, YouTube starts hosting videos over this network). If one
video goes viral, the bandwidth demands are going to spike and overwhelm
the network. But if the network can automatically expand and shrink as
demand changes, you may be able to solve the 'Reddit hug' problem.
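As a very rough illustration of the rateless idea behind fountain codes
(this is a toy random-XOR encoder only - no degree distribution and no
decoder, so not a real LT/Raptor code):

    import os, random

    BLOCK_SIZE = 1024
    K = 8                                         # source blocks; illustrative
    source = [os.urandom(BLOCK_SIZE) for _ in range(K)]

    def mint_piece(seed):
        """Mint one coded piece: XOR of a pseudo-random subset of source blocks."""
        rng = random.Random(seed)
        subset = [i for i in range(K) if rng.random() < 0.5] or [0]
        piece = bytes(BLOCK_SIZE)
        for i in subset:
            piece = bytes(a ^ b for a, b in zip(piece, source[i]))
        return seed, piece

    # New hosts joining during a demand spike just get freshly minted pieces;
    # the seed records which subset each piece covers.
    pieces = [mint_piece(seed) for seed in range(12)]
    print(len(pieces), "coded pieces minted from", K, "source blocks")

The point is that the number of coded pieces isn't fixed in advance, so a
cluster can hand out more of them as it grows.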
And finally, allowing machines to be online only some of the time gives the
network tolerance for things like power failures, without needing to
immediately assume that a lost node is gone for good.