<p dir="ltr">Can't you pretend to run more nodes than you actually are running in order to "mine" more credits? What could prevent that? </p>
<p dir="ltr">- Sent from my phone</p>
<div class="gmail_quote">Den 1 dec 2013 17:25 skrev "David Vorick" <<a href="mailto:david.vorick@gmail.com">david.vorick@gmail.com</a>>:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr"><br><br><div class="gmail_quote">---------- Forwarded message ----------<br>From: <b class="gmail_sendername">David Vorick</b> <span dir="ltr"><<a href="mailto:david.vorick@gmail.com" target="_blank">david.vorick@gmail.com</a>></span><br>
Date: Sun, Dec 1, 2013 at 11:25 AM<br>Subject: Re: Erasure Coding<br>To: Alex Elsayed <<a href="mailto:eternaleye@gmail.com" target="_blank">eternaleye@gmail.com</a>><br><br><br><div dir="ltr"><div>Alex, thanks for those resources. I will check them out later this week.<br>
<br></div><div>I'm trying to create something that will function as a market for cloud storage. People can rent out storage to the network for credit (a cryptocurrency - not bitcoin, but something heavily inspired by bitcoin and the other altcoins), and people who have credit (obtained either by trading on an exchange or by renting storage to the network) can in turn rent storage from the network.<br>
<br></div><div>So the clusters will be spread out over large distances. With RAID5 and 5 disks, the network needs to communicate 4 bits to recover each lost bit. That's really expensive. The computational cost is not the concern; the bandwidth cost is (though there are computational limits as well).<br>
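<br>To make the bandwidth cost concrete, here is a rough back-of-the-envelope sketch (assuming a conventional repair where rebuilding one lost share means reading k surviving shares; the numbers are illustrative, not a decided design):<br>
<pre>
# Repair traffic for a k-of-n erasure code; RAID5 with 5 disks is the
# k=4, n=5 case. Illustrative numbers only.
def repair_traffic_gb(share_size_gb, k):
    """Conventional repair reads k surviving shares to rebuild one lost share."""
    return share_size_gb * k

# Hypothetical example: a 100 GB object split into k=4 data shares of 25 GB
# each plus one parity share (n=5). Losing one share means moving 100 GB
# across the network to recover 25 GB of data.
print(repair_traffic_gb(share_size_gb=25, k=4))   # 100
</pre>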
<br></div><div>When you buy storage, all of the redundancy and erasure coding happens behind the scenes. So a network that needs 3x redundancy will be 3x as expensive to rent storage from. To be competitive, the redundancy factor should be as low as possible. With Reed-Solomon and infinite bandwidth, I think we could safely get the redundancy below 1.2x. But with all the other requirements, I'm not sure what a reasonable minimum is.<br>
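<br>For a sense of where numbers like 3x and 1.2x come from (hypothetical parameters, not a chosen configuration): with Reed-Solomon, k data shares plus m parity shares give a raw-storage overhead of (k + m) / k while tolerating any m losses.<br>
<pre>
# Redundancy factor for a hypothetical Reed-Solomon configuration:
# k data shares, m parity shares; any k of the k+m shares can rebuild the data.
def redundancy_factor(k, m):
    return (k + m) / k

print(redundancy_factor(k=1, m=2))    # 3.0 -> replication-style 3x overhead
print(redundancy_factor(k=20, m=4))   # 1.2 -> tolerates any 4 losses at 1.2x
</pre>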
<br></div><div>Since many people can be renting many different clusters, each machine on the network may (in practice, will) be participating in many clusters at once, probably hundreds to thousands. So the cost of handling a failure should be fairly low. I don't think this requirement is as extreme as it may sound: if you are participating in 100 clusters each renting an average of 50 GB of storage, your overall expenses should be similar to participating in a few clusters each renting an average of 1 TB. The important part is that you can keep up with multiple simultaneous network failures, and that a single node is never a bottleneck in the repair process.<br>
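<br>A quick sanity check on that claim, with purely illustrative numbers:<br>
<pre>
# Total storage committed is what drives cost, not how many clusters it is
# spread across. Numbers below are made up for illustration.
many_small = 100 * 50      # 100 clusters renting 50 GB each  -> 5000 GB
few_large  = 5 * 1000      # 5 clusters renting 1 TB each     -> 5000 GB
print(many_small, few_large)   # same total commitment, similar overall expense
</pre>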
</div><div><br></div><div>We need hundreds to thousands of machines in a single cluster for multiple reasons. The first is that it makes the cluster roughly as stable as the network as a whole. If you have 100 machines randomly selected from the network, and on average 1% of the machines on the network fail per day, your cluster shouldn't stray too far from 1% failures per day, and even less so if you have 300 or 1000 machines. Another reason is that the network is used to mine currency based on how much storage you are contributing. If there is some way to trick the network into thinking you are storing data when you aren't (or to lie about the volume), then you've broken the network. Having many nodes in every cluster is one of the ways cheating is prevented (there are a few others too, but they're off-topic here).<br>
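<br>The stability argument can be sketched with a simple binomial model (assuming independent failures at the stated 1% daily rate; real failures will be more correlated):<br>
<pre>
# Standard deviation of a cluster's daily failure fraction under a binomial
# model: sqrt(p * (1 - p) / n). It shrinks as the cluster grows.
from math import sqrt

p = 0.01                       # assumed network-wide daily failure rate
for n in (100, 300, 1000):
    sd = sqrt(p * (1 - p) / n)
    print(n, round(100 * sd, 2))   # ~1.0, ~0.57, ~0.31 (% around the 1% mean)
</pre>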
</div><div><br></div><div>Cluster size should be dynamic (fountain codes?) to support a cluster that grows and shrinks with demand. Imagine that some of the files become public (for example, YouTube starts hosting videos over this network). If one video goes viral, the bandwidth demands are going to spike and overwhelm the network. But if the network can automatically expand and shrink as demand changes, you may be able to solve the 'Reddit hug' problem.<br>
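<br>A toy illustration of why fountain codes fit the "grow on demand" idea (a sketch only; a real design needs a proper degree distribution and a decoder):<br>
<pre>
# Rateless ("fountain") intuition: new coded blocks can be minted on demand,
# so a cluster can add capacity under load without re-encoding the original
# layout. Toy LT-style encoder; illustrative, not production code.
import os, random

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def encode_block(source_blocks, seed):
    rng = random.Random(seed)
    degree = rng.randint(1, len(source_blocks))           # toy degree choice
    chosen = rng.sample(range(len(source_blocks)), degree)
    block = source_blocks[chosen[0]]
    for i in chosen[1:]:
        block = xor(block, source_blocks[i])
    return seed, block         # the seed lets a decoder re-derive 'chosen'

source = [os.urandom(1024) for _ in range(8)]             # 8 source blocks
extra  = [encode_block(source, s) for s in range(20)]     # mint as many as needed
</pre>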
<br></div><div>And finally, tolerating machines that are only online some of the time gives the network resilience to things like power failures, without needing to immediately assume that a lost node is gone for good.<br></div>
<div><br></div></div>
</div><br></div>
<br></blockquote></div>