Fwd: Erasure Coding
Natanael
natanael.l at gmail.com
Mon Dec 2 08:07:29 UTC 2013
You just need to fake a lot of connections, then.
- Sent from my phone
On 2 Dec 2013 06:15, "David Vorick" <david.vorick at gmail.com> wrote:
> You don't get mining money until you are actually storing data. Nodes are
> randomly selected; you would only be able to cheat if you randomly ended up
> hosting the same piece of data 10,000 times.
>
> But if the network is sufficiently large, the probability of one node
> getting to host the same data twice is very slim, and engineering a
> "coincidence" would be very expensive, because nodes are randomly selected
> and the minimum amount of time you can rent storage for is 1 month.
> Furthermore, there is rotation, meaning that after a certain amount of time
> a different random host will be selected to keep the data, so any
> engineered coincidence will not last very long.
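>
> As a back-of-the-envelope illustration of how slim that coincidence is (a
> sketch with made-up host counts, not numbers from any actual design), here
> is the chance that an operator splitting their storage across many fake
> identities gets handed the same piece of data more than once, assuming each
> piece's replicas are placed on hosts drawn uniformly at random:
>
> from math import comb
>
> def p_duplicate_assignment(total_hosts, attacker_hosts, replicas):
>     """Probability that at least two of one piece's randomly placed
>     replicas land on hosts run by the same operator (hypergeometric)."""
>     honest = total_hosts - attacker_hosts
>     p0 = comb(honest, replicas) / comb(total_hosts, replicas)
>     p1 = attacker_hosts * comb(honest, replicas - 1) / comb(total_hosts, replicas)
>     return 1 - p0 - p1
>
> # Made-up example: 100,000 hosts, an attacker posing as 100 of them,
> # 30 replicas per piece.
> print(p_duplicate_assignment(100_000, 100, 30))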
>
>
> On Sun, Dec 1, 2013 at 5:29 PM, Natanael <natanael.l at gmail.com> wrote:
>
>> Yeah, so how will anybody stop you from hosting 10 GB but pretending to
>> be 10,000 nodes, thus making as much as somebody who is actually storing
>> 100 TB?
>>
>> - Sent from my phone
>> On 1 Dec 2013 22:37, "David Vorick" <david.vorick at gmail.com> wrote:
>>
>>> Thanks Dirk, I'll be sure to check all those out as well. Haven't yet
>>> heard of spinal codes.
>>>
>>> Natanael, all of the mining is based on the amount of storage that you
>>> are contributing. If you are hosting 100 nodes each with 10 GB, you will
>>> mine the same amount as if you had just one node with 1 TB. The only way
>>> you could mine extra credits is if you could convince the system that you
>>> are hosting more storage than you actually are.
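>>>
>>> To make that arithmetic concrete (a toy sketch of mine, not the project's
>>> actual reward code, with an assumed 1 PB network and 100-credit block
>>> reward): if the block reward is split in proportion to proven storage,
>>> spreading the same bytes across many identities earns exactly the same
>>> amount as proving them from one node.
>>>
>>> def mining_share(proven_bytes, network_bytes, block_reward):
>>>     """Credit earned per block, proportional to verified storage."""
>>>     return block_reward * proven_bytes / network_bytes
>>>
>>> NETWORK = 10**15                                     # assumed total: 1 PB
>>> one_big    = mining_share(10**12, NETWORK, 100)      # one node proving 1 TB
>>> many_small = sum(mining_share(10**10, NETWORK, 100)  # 100 nodes, 10 GB each
>>>                  for _ in range(100))
>>> print(one_big, many_small)                           # identical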
>>>
>>>
>>> On Sun, Dec 1, 2013 at 2:40 PM, <jason.johnson at p7n.net> wrote:
>>>
>>>> What if you gave them the node to use? Like, they had to register for a
>>>> node. I started something like this but sort of stopped because I’m lazy.
>>>>
>>>>
>>>>
>>>> From: tahoe-dev-bounces at tahoe-lafs.org [mailto:
>>>> tahoe-dev-bounces at tahoe-lafs.org] On Behalf Of Natanael
>>>> Sent: Sunday, December 1, 2013 1:37 PM
>>>> To: David Vorick
>>>> Cc: tahoe-dev at tahoe-lafs.org
>>>> Subject: Re: Fwd: Erasure Coding
>>>>
>>>>
>>>>
>>>> Can't you pretend to run more nodes than you actually are running in
>>>> order to "mine" more credits? What could prevent that?
>>>>
>>>> - Sent from my phone
>>>>
>>>> On 1 Dec 2013 17:25, "David Vorick" <david.vorick at gmail.com> wrote:
>>>>
>>>>
>>>>
>>>> ---------- Forwarded message ----------
>>>> From: *David Vorick* <david.vorick at gmail.com>
>>>> Date: Sun, Dec 1, 2013 at 11:25 AM
>>>> Subject: Re: Erasure Coding
>>>> To: Alex Elsayed <eternaleye at gmail.com>
>>>>
>>>> Alex, thanks for those resources. I will check them out later this week.
>>>>
>>>> I'm trying to create something that will function as a market for cloud
>>>> storage. People can rent out storage to the network for credit (a
>>>> cryptocurrency - not bitcoin, but something heavily inspired by bitcoin
>>>> and the other altcoins), and then people who have credit (which can be
>>>> obtained by trading over an exchange, or by renting storage to the
>>>> network) can rent storage from the network.
>>>>
>>>> So the clusters will be spread out over large distances. With RAID5 and
>>>> 5 disks, the network needs to communicate 4 bits to recover each lost bit.
>>>> That's really expensive. The computational cost is not the concern; the
>>>> bandwidth cost is the concern (though there are computational limits as
>>>> well).
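>>>>
>>>> To put a rough number on that repair traffic (my own cost sketch; k and
>>>> the share size below are just assumptions): with a classic MDS code,
>>>> rebuilding one lost share means fetching k surviving shares over the
>>>> network.
>>>>
>>>> def repair_traffic_bytes(lost_bytes, k):
>>>>     """Bytes moved over the network to rebuild lost data with naive MDS
>>>>     repair: every lost byte requires reading k surviving bytes."""
>>>>     return k * lost_bytes
>>>>
>>>> # RAID5 over 5 disks behaves like k=4, m=1: 4 bytes moved per byte lost.
>>>> print(repair_traffic_bytes(10 * 2**30, 4))  # repairing a 10 GiB share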
>>>>
>>>> When you buy storage, all of the redundancy and erasure coding happens
>>>> behind the scenes. So a network that needs 3x redundancy will be 3x as
>>>> expensive to rent storage from. To be competitive, this number should be as
>>>> low as possible. If we had Reed-Solomon and infinite bandwidth, I think we
>>>> could safely get the redundancy below 1.2. But with all the other
>>>> requirements, I'm not sure what a reasonable minimum is.
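>>>>
>>>> A quick arithmetic sketch (the parameter choices are mine, not settled
>>>> numbers): the storage overhead of a (k, m) erasure code, and therefore
>>>> the price multiple the renter sees, is (k + m) / k.
>>>>
>>>> def redundancy_factor(k, m):
>>>>     """Total stored bytes divided by useful bytes for a (k, m) code."""
>>>>     return (k + m) / k
>>>>
>>>> print(redundancy_factor(1, 2))   # 3.0 -> plain 3x replication
>>>> print(redundancy_factor(10, 2))  # 1.2 -> e.g. Reed-Solomon, 10 data + 2 parity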
>>>>
>>>> Since many people can be renting many different clusters, each machine
>>>> on the network may (will) be participating in many clusters at once
>>>> (probably in the hundreds to thousands). So the cost of handling a failure
>>>> should be fairly cheap. I don't think this requirement is as extreme as it
>>>> may sound, because if you are participating in 100 clusters each renting an
>>>> average of 50 GB of storage, your overall expenses should be similar to
>>>> participating in a few clusters each renting an average of 1 TB. The
>>>> important part is that you can keep up with multiple simultaneous network
>>>> failures, and that a single node is never a bottleneck in the repair
>>>> process.
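>>>>
>>>> As a tiny sanity check on that comparison (the "5 clusters" figure is just
>>>> my stand-in for "a few"): what matters is the total storage rented, not
>>>> how many clusters it is split across.
>>>>
>>>> many_small_clusters = 100 * 50   # 100 clusters renting 50 GB each -> 5,000 GB
>>>> few_large_clusters  = 5 * 1000   # 5 clusters renting 1 TB each    -> 5,000 GB
>>>> print(many_small_clusters == few_large_clusters)  # same total commitment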
>>>>
>>>>
>>>>
>>>> We need hundreds to thousands of machines in a single cluster for multiple
>>>> reasons. The first is that it makes the cluster roughly as stable as the
>>>> network as a whole. If you have 100 machines randomly selected from the
>>>> network, and on average 1% of the machines on the network fail per day,
>>>> your cluster shouldn't stray too far from 1% failures per day. Even more so
>>>> if you have 300 or 1000 machines. But another reason is that the network is
>>>> used to mine currency based on how much storage you are contributing to the
>>>> network. If there is some way you can trick the network into thinking you
>>>> are storing data when you aren't (or you can somehow lie about the volume),
>>>> then you've broken the network. Having many nodes in every cluster is one
>>>> of the ways cheating is prevented (there are a few others too, but they're
>>>> off-topic).
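>>>>
>>>> Here is a small simulation sketch (an assumed 1% daily failure rate, with
>>>> cluster sizes picked to match the examples above) of how tightly a
>>>> randomly sampled cluster tracks the network-wide failure rate as it grows:
>>>>
>>>> import random
>>>>
>>>> def daily_failure_stats(cluster_size, p_fail=0.01, trials=10_000):
>>>>     """Mean and worst daily failure fraction across simulated days."""
>>>>     fractions = [sum(random.random() < p_fail for _ in range(cluster_size))
>>>>                  / cluster_size
>>>>                  for _ in range(trials)]
>>>>     return sum(fractions) / trials, max(fractions)
>>>>
>>>> for size in (100, 300, 1000):
>>>>     print(size, daily_failure_stats(size))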
>>>>
>>>>
>>>>
>>>> Cluster size should be dynamic (fountain codes?) to support a cluster
>>>> that grows and shrinks with demand. Imagine if some of the files become
>>>> public (for example, YouTube starts hosting videos over this network). If
>>>> one video goes viral, the bandwidth demands are going to spike and
>>>> overwhelm the network. But if the network can automatically expand and
>>>> shrink as demand changes, you may be able to solve the 'Reddit hug' problem.
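>>>>
>>>> Here is a very rough toy of the rateless idea behind fountain codes (my
>>>> own sketch, not a real LT/Raptor implementation; decoding via peeling or
>>>> Gaussian elimination is omitted): extra coded pieces can be minted on
>>>> demand by XOR-ing random subsets of the source blocks, so a cluster can
>>>> add capacity without re-planning a fixed (k, m) layout.
>>>>
>>>> import os
>>>> import random
>>>>
>>>> def xor_blocks(blocks):
>>>>     """XOR a list of equal-length byte blocks together."""
>>>>     out = bytearray(len(blocks[0]))
>>>>     for block in blocks:
>>>>         for i, byte in enumerate(block):
>>>>             out[i] ^= byte
>>>>     return bytes(out)
>>>>
>>>> def mint_coded_piece(source_blocks, rng=random):
>>>>     """Make one new coded piece from a random non-empty subset of blocks."""
>>>>     degree = rng.randint(1, len(source_blocks))
>>>>     chosen = rng.sample(range(len(source_blocks)), degree)
>>>>     return chosen, xor_blocks([source_blocks[i] for i in chosen])
>>>>
>>>> source = [os.urandom(64) for _ in range(8)]            # 8 source blocks
>>>> extras = [mint_coded_piece(source) for _ in range(5)]  # mint as demand grows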
>>>>
>>>> And finally, machines that only need to be on some of the time give
>>>> the network a tolerance for things like power failures, without needing to
>>>> immediately assume that a lost node is gone for good.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> tahoe-dev mailing list
>>>> tahoe-dev at tahoe-lafs.org
>>>> https://tahoe-lafs.org/cgi-bin/mailman/listinfo/tahoe-dev
>>>>
>>>>
>>>
>