[tahoe-dev] report from the volunteergrid

Wed Apr 27 10:01:18 PDT 2011

Folks:

I just wrote the following note to the volunteergrid-l mailing list.
I thought it might be of interest to others.

Regards,

Zooko

[excerpted and edited by Zooko before posting to the public tahoe-dev list.]

---------- Forwarded message ----------
From: Zooko O'Whielacronx <zooko at zooko.com>
Date: Wed, Apr 27, 2011 at 10:59 AM
Subject: Re: [volunteergrid-l] Hard disk crash / leaving the volunteergrid

I'm thinking of moving my blog from the public test grid:

http://insecure.tahoe-lafs.org/uri/URI:DIR2-RO:ixqhc4kdbjxc7o65xjnveoewym:5x6lwoxghrd5rxhwunzavft2qygfkt27oj3fbxlq4c6p45z5uneq/blog.html

to the volunteergrid, and establishing a read-only proxy (nginx I
guess) so that even people who don't run their own tahoe-lafs gateway
can read my blog.

This mailing list is awfully quiet, and David Triendl's absence might
hurt, but overall the volunteergrid seems extremely robust to me. We
currently have eleven running servers operated by (I think) ten people.
If all ten of us are committed to maintaining each other's shares
long-term then this is the most robust storage system I've ever seen.
:-)

Think about it: in order to delete any data, an accident or malicious
attack would have to destroy the persistent storage of a substantial
number of servers operated by independent people and (I assume) mostly
located in separate geographical, legal, and networking domains.
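To put a rough number on that argument, here is a minimal sketch in
Python. It assumes k-of-n erasure coding with one share per server and
an independent per-server loss probability; the 3-of-10 parameters and
the 10% figure below are illustrative assumptions, not measurements of
the volunteergrid.

```python
from math import comb


def file_loss_probability(n: int, k: int, p_server_loss: float) -> float:
    """Probability that fewer than k of n shares survive.

    Assumes one share per server and that servers fail independently,
    each with probability p_server_loss (both are assumptions).
    """
    p_survive = 1.0 - p_server_loss
    # Sum the binomial probabilities of exactly s surviving shares,
    # for every s that is too small to reconstruct the file.
    return sum(
        comb(n, s) * p_survive**s * p_server_loss ** (n - s)
        for s in range(k)
    )


if __name__ == "__main__":
    # Hypothetical inputs: 3-of-10 encoding, 10% chance of losing any one server.
    print(file_loss_probability(10, 3, 0.10))
```

The point of the sketch is only that the loss probability falls off
very quickly as long as server failures really are independent, which
is what separate operators and locations are meant to provide.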

My only concern is capacity—I'm not sure how many of the servers are
full or nearly full and to what degree their operators have the
intention and the resources (money) to expand their capacity in the
future.

I'm personally currently paying around USD 6.00 per month to maintain
50 GB of capacity on Amazon's (famously unreliable) Elastic Block
Store. I hope to move those shares over to Amazon's S3, possibly its
Reduced Redundancy Storage tier, in the future. I intend to start
paying more
to increase capacity if that 50 GB starts filling up. It currently has
about 30 GB in use.
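For anyone comparing notes on cost, the arithmetic behind those
figures can be sketched as follows. The inputs are the approximate
numbers quoted above; the variable names are just for illustration.

```python
# Approximate figures from the note above.
monthly_cost_usd = 6.00
capacity_gb = 50
used_gb = 30

# Cost per GB of provisioned capacity, and per GB actually holding shares.
cost_per_provisioned_gb = monthly_cost_usd / capacity_gb
cost_per_used_gb = monthly_cost_usd / used_gb

# Fraction of the volume currently in use.
utilization = used_gb / capacity_gb

print(f"USD per provisioned GB-month: {cost_per_provisioned_gb:.2f}")
print(f"USD per used GB-month:        {cost_per_used_gb:.2f}")
print(f"Utilization:                  {utilization:.0%}")
```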

I just added a comment to a relevant ticket:

http://tahoe-lafs.org/trac/tahoe-lafs/ticket/648# collect server
capacities and put them on the welcome page, output of 'df' for SFTP,
etc.

Here are the perspectives on servers from my gateway on my local
laptop, and from my gateway running on Amazon EC2. There is one
difference: my EC2 gateway is connected to "kpreid at eider
xpu6hdv46lnkzfxox5tfn36mo3oo25eh" but my local laptop isn't. I'm
behind NAT here, so I infer that eider is also behind NAT and thus
eider can connect to my EC2 instance but not to my laptop. Kevin: if
it isn't too much trouble, please configure a port-forward so that my
laptop can connect to eider. That would make it twelve servers
operated by ten people.

By the way, I transferred all the old shares from my *old* laptop
(ootles) to my EC2 instance, "a2". I intend to post a HOWTO to this
list at some point describing how I did that. (It's pretty simple.)

Oh, I do have one more concern about data preservation: garbage
collection. The way things stand now most people have (I infer) turned
on garbage collection, which means that if some personal disruption
keeps me from renewing my leases every month, my data will be
deleted. I consider this by far the biggest
preservation risk in the current system and I'm really not satisfied
with it. I'm not yet precisely sure how to improve it, though...
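The risk can be made concrete with a small sketch in Python. This is
not Tahoe-LAFS code; it only models the rule "a share whose lease has
not been renewed within the lease period becomes eligible for
deletion". The 31-day lease period and the function name are
assumptions for illustration.

```python
from datetime import datetime, timedelta

# Assumed lease period; a server operator may configure something else.
LEASE_DURATION = timedelta(days=31)


def share_is_collectable(
    last_renewed: datetime,
    now: datetime,
    lease_duration: timedelta = LEASE_DURATION,
) -> bool:
    """Return True if the lease has expired and the share may be deleted."""
    return now - last_renewed > lease_duration


if __name__ == "__main__":
    renewed = datetime(2011, 4, 27)
    # Renewed on schedule: still protected three weeks later.
    print(share_is_collectable(renewed, renewed + timedelta(days=21)))
    # One missed monthly renewal: the share is now eligible for deletion.
    print(share_is_collectable(renewed, renewed + timedelta(days=45)))
```

Under this model a single missed renewal cycle is enough to expose
every share, which is why an interruption on the uploader's side
matters more than the failure of any one server.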

Regards,

Zooko