[tahoe-dev] public backup grid

Shawn Willden shawn at willden.org
Sat Jun 18 06:47:31 PDT 2011


Welcome, green!

On Fri, Jun 17, 2011 at 9:59 PM, green <greenfreedom10 at gmail.com> wrote:

> The tahoe-lafs package just appeared in Debian repositories.


Awesome!  It's been in Ubuntu for a while, but Debian should give it much
wider exposure.


> Okay, so searching for actual grid backup implementations, I find these:
> http://tahoe-lafs.org/trac/tahoe-lafs/wiki/VolunteerGrid
> http://bigpig.org/
>
> Two great ideas, but maximum 20 servers for each.  Only one mentions a
> mailing list, and subscription is moderated and archives are private.  Why
> the secrecy, and why the limit?  If a grid is better with fewer members,
> why can a project not have multiple grids?
>

The reason for the secrecy and limited size is that Tahoe as yet has no
way to prevent abuse.  With a small, closed community it's relatively easy
to maintain a sense of fair play and to feel that your "neighbors" are
trustworthy.

The fairness needed to make a Tahoe grid operate well comprises a few
factors, some more obvious than others.  In fact, the second volunteer grid
has created some additional rules based on the experiences of the first.

The most obvious rule is that people shouldn't use more storage than they
contribute, and that has to factor in file expansion.  If you store 1 MB
of data in the grid and you're using the default settings (3-of-10
encoding), your data consumes about 3.3 MB of storage, so you need to
provide at least that much to the grid.  Obviously, people who don't run a
storage server shouldn't use the grid at all -- which means that the grid
introducer FURL must be kept secret and gateways must be protected.
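
To make the expansion arithmetic concrete, here is a minimal sketch in
Python.  It assumes the default 3-of-10 encoding (any 3 of 10 shares
rebuild the file) and ignores per-share overhead, so the numbers are
approximate.  The function names are hypothetical, not part of Tahoe.

```python
def storage_consumed(file_bytes, needed=3, total=10):
    """Approximate bytes stored grid-wide for one file.

    Assumes k-of-N erasure coding with needed=k and total=N, and
    ignores hashes and other per-share overhead.
    """
    return file_bytes * total / needed


def required_contribution(file_sizes, needed=3, total=10):
    """Minimum storage a user should provide to cover their own files."""
    return sum(storage_consumed(size, needed, total) for size in file_sizes)


# 1 MB stored with the defaults costs the grid about 3.3 MB.
one_mb_cost = storage_consumed(1_000_000)
```

With different encoding parameters the factor changes accordingly, e.g.
2-of-4 would double the stored size instead.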

Another issue is availability.  A node that is down a lot doesn't really
contribute, because when it's down it can neither store nor provide shares.
Volunteergrid 1 had an issue for a while where a significant number of
nodes were down, which made the grid much less useful.

A less obvious issue can arise even when everyone keeps their nodes on-line
and consumes less than they provide.  If there is a wide disparity in the
amount of storage provided by different nodes in the system, one or two
people who provide a lot of storage and -- in fairness! -- expect to be able
to consume a correspondingly large amount can saturate the grid and make
uploads impossible.

Suppose, for example, that there are 9 nodes that each provide 100 MB of
storage, totaling 900 MB, and one node that provides 2 GB.  The large node
operator assumes that he should be able to consume 2 GB, but once his
uploads have consumed 1 GB of space, everyone finds that the grid is unable
to accept new files.  Only 1 GB of space has been used, but it's been spread
evenly across the 10 nodes, so the 9 small nodes are full while the big one
still has 1.9 GB free.  Because Tahoe needs to spread its shares across
multiple servers to get the reliability benefits, all uploads fail.
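
The saturation example above can be checked with a few lines of Python.
This is only a sketch: it assumes shares land perfectly evenly on every
node and takes 1 GB as 1000 MB.  Real share placement is not this tidy,
and the helper names are made up for illustration.

```python
def free_after_even_fill(capacities_mb, used_mb_total):
    """Free space per node after spreading used_mb_total evenly."""
    per_node = used_mb_total / len(capacities_mb)
    return [cap - per_node for cap in capacities_mb]


def usable_before_saturation(capacities_mb):
    """Space that can be consumed before the smallest node fills up,
    assuming every node receives an equal portion of every upload."""
    return min(capacities_mb) * len(capacities_mb)


capacities = [100] * 9 + [2000]          # nine 100 MB nodes, one 2 GB node
free = free_after_even_fill(capacities, 1000)
nodes_with_room = sum(1 for f in free if f > 0)
usable = usable_before_saturation(capacities)
```

Under these assumptions the grid advertises 2900 MB in total but only
1000 MB is usable, and after that a single node has room left, which is
not enough to disperse shares.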

There was a period of time during which the combination of unavailable and
full nodes made Volunteergrid 1 unable to accept uploads with a decent
dispersion, even though there was substantial capacity available.

Because of these issues, Volunteergrid 2 has adopted some additional rules
around uptime and storage capacity.  We require you to commit to 95% uptime
and to provide at least 500 GB to the grid and no more than 1 TB in one
node.  If you want to provide/consume more than 1 TB you need to operate
multiple storage nodes -- and we further require that these nodes not be in
the same location.  All of these rules are more in the nature of gentlemen's
agreements than anything else, in part because we're all willing to be
flexible where it makes sense and in part because there's no real
enforcement mechanism.  But keeping the community smallish (I think we'd be
okay with more than 20 nodes) and getting new members to agree to the rules
before giving them access to the introducer FURL seems to work well.
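
For anyone who wants to sanity-check an offer against the rules above,
here is a rough sketch.  There is no such tool in Tahoe or in VG2; the
names and data layout are hypothetical, and it assumes 1 TB = 1000 GB.
Since the rules are gentlemen's agreements, treat the output as a
conversation starter rather than a verdict.

```python
MIN_GB = 500        # minimum storage per node
MAX_GB = 1000       # maximum per node, assuming 1 TB = 1000 GB
MIN_UPTIME = 0.95   # committed uptime as a fraction


def check_offer(nodes):
    """Return a list of rule problems for one member's nodes.

    nodes: list of dicts with keys 'gb', 'uptime' and 'location'.
    An empty list means the offer fits the stated rules.
    """
    problems = []
    for i, node in enumerate(nodes):
        if node["gb"] < MIN_GB:
            problems.append(f"node {i}: provides less than {MIN_GB} GB")
        if node["gb"] > MAX_GB:
            problems.append(f"node {i}: provides more than {MAX_GB} GB")
        if node["uptime"] < MIN_UPTIME:
            problems.append(f"node {i}: uptime below {MIN_UPTIME:.0%}")
    locations = [node["location"] for node in nodes]
    if len(set(locations)) < len(locations):
        problems.append("multiple nodes share a location")
    return problems


ok = check_offer([{"gb": 800, "uptime": 0.99, "location": "home"}])
too_big = check_offer([{"gb": 2000, "uptime": 0.99, "location": "home"}])
```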

BTW, VG2 is actively recruiting new members who are looking for
highly-available high-volume (~1TB) storage.

-- 
Shawn