[volunteergrid2-l] Lots of remarks and questions ;-)

Shawn Willden shawn at willden.org
Thu Dec 8 20:48:19 UTC 2011


On Thu, Dec 8, 2011 at 10:12 AM, Christoph Langguth <
christoph at rosenkeller.org> wrote:

> b2) What is the actual meaning of "shares.needed", "shares.happy", and
> "shares.total" ? I have been playing around with these using my (only)
> local node, and here are my observations so far:
> - each node ( = connected tahoe-lafs instance) provides multiple shares
> for storing files (how many? where can that be configured?)
> - shares.total is the total number of shares (over all connected nodes)
> that an upload will fill (regardless of the node where the file is uploaded
> -- in other words: some of them may be on the same host)
>

Yes.  This is the number of shares which will be distributed.


> - shares.needed is the minimum number of shares a file needs to be
> "distributed" to. However, these shares may all be on the same node.
>

shares.needed is the number of shares required to reassemble a file.  If
that many shares are available in the grid, you can get it back.


> - shares.happy is the number of *different* nodes that must hold a copy of
> (part of) the file.
>

The setting really should be named servers.happy.  It is the number of
separate nodes which must hold shares for the upload to succeed.  It used to
count shares, but that led to situations where a file was uploaded to only
a small number of nodes, so you really didn't have the redundancy you
thought you did.

Ideally, I would recommend that you set shares.total and shares.happy to
the same value, and that it be the same as the number of nodes in the grid.
This would ensure that every node in the grid gets a piece of your file.
Then you can pick a value for shares.needed that gives you an appropriate
degree of redundancy.

However, doing that gives you poor write availability: if even one node is
unreachable, the upload cannot satisfy shares.happy, so you often won't be
able to upload.  To avoid that, I recommend setting shares.total to a value
slightly smaller than the number of nodes in the grid, and shares.happy to
shares.total - 1.  Then pick an appropriate shares.needed.
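
As a rough sketch of that rule of thumb in Python: the function name
suggest_encoding and the margin parameter are purely illustrative (they are
not Tahoe settings), and margin=2 is just one assumed reading of "slightly
smaller".

```python
def suggest_encoding(grid_nodes, needed, margin=2):
    """Suggest shares.needed / shares.happy / shares.total for a grid.

    margin is how many fewer shares than grid nodes to use for
    shares.total; the value 2 is an assumption, not a Tahoe default.
    """
    total = grid_nodes - margin
    happy = total - 1
    if happy < 1 or not 1 <= needed <= total:
        raise ValueError("grid too small or shares.needed out of range")
    return {
        "shares.needed": needed,
        "shares.happy": happy,
        "shares.total": total,
        # Each share is 1/needed of the file, so the grid stores
        # total/needed times the file size.
        "expansion": total / needed,
        # The file is recoverable while at least `needed` shares survive.
        "shares_that_can_be_lost": total - needed,
    }


params = suggest_encoding(grid_nodes=12, needed=3)
```

With 12 nodes and shares.needed = 3 this gives shares.total = 10 and
shares.happy = 9, so up to 7 shares can disappear before the file is lost.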

> b3) Does it make sense to use a "helper" service, and if so, is there any
> use in making its FURL publicly available?


The purpose of a helper is to reduce the impact of uploads on a slow
Internet connection.  When you upload a file, you have to upload S*N/K
bytes, where S is the size of the file, N is shares.total and K is
shares.needed.  When you upload a file to a helper, you upload S bytes to
the helper and it sends S*N/K bytes out to the servers in the grid.  So a
helper with a fast connection can help nodes with slow connections.
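
To put numbers on that, here is a small Python sketch of the arithmetic.
The function name upload_bytes is made up for illustration, and it ignores
protocol overhead; it only applies the S*N/K formula above.

```python
def upload_bytes(file_size, needed, total, via_helper=False):
    """Approximate bytes the uploading client sends over its own link."""
    if via_helper:
        # The client sends the file once; the helper then pushes
        # S*N/K bytes out to the storage servers.
        return file_size
    # Without a helper the client itself sends S*N/K bytes.
    return file_size * total / needed


# A 100 MB file with shares.needed = 3 and shares.total = 10.
direct = upload_bytes(100_000_000, needed=3, total=10)
helped = upload_bytes(100_000_000, needed=3, total=10, via_helper=True)
```

Here the direct upload sends roughly 333 MB over the client's link, while
the upload through a helper sends 100 MB.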

I'm not sure what the implication of making a helper FURL public is.  I
*think* you'd still need the introducer FURL in order to use the helper,
but I'm not sure.  I think we should protect helper FURLs.


> I guess it won't harm to leave it on locally, but could other folks in the
> grid also benefit somehow if they knew "my" helper?


Using a local helper doesn't do you any good.  The only purpose of a helper
would be to benefit others who have slower connections than you do.


> From what I understand, the purpose of the helper is to reduce upload
> delays by providing quick uploads and caching, and only then distributing
> data into the grid (as stated earlier, our host is connected pretty
> decently -- GBit Ethernet just 2 hops from the backbone).
>

The helper doesn't really do much caching.  It does hold a copy of the
encrypted file while it uploads it to the grid, but the uploader still has
to wait for the helper to finish before its upload is done.

That said, your connection certainly sounds like one the rest of us would
love to take advantage of!


> c1) Any objections or bad experiences with duplicity on tahoe? Or any
> further experiences or hints that may be helpful?
>

I looked into it a while ago and decided against it.  Unfortunately, I
don't remember why I decided against it.  I'm using "tahoe backup".


> c2) Assuming a total disaster where everything goes up in smoke locally,
> the only backup remaining would be the one in VG2. But it would be in a
> directory (Tahoe-URI) which is totally incomprehensible and impossible to
> remember. So as a safety net, would it be ok to post here and/or on the
> pigpig homepage a file -- encrypted, of course :-) -- with the tiny bits of
> vital information so that I could get them back from the HP, or from one of
> you guys, in case of disaster?
>

Assuming you use an alias to access your root directory (recommended), you
just need to save your aliases file, which you'll find in
.tahoe/private/aliases.

It's a very small bit of data, so it's easy to keep it in multiple places.
Just keep it appropriately protected because anyone with access to it and
the introducer FURL will have access to all of your data.  Personally, I
PGP-encrypted it with a passphrase I won't forget, then printed out the
resulting ASCII-armored data and gave copies to several friends and family
members.  My wife also knows the passphrase and who has the copies, though
she'd need help getting a Tahoe node set up and connected so she could
restore the data in the event of my demise.


>
> c3) And the final question: My understanding is that orphaned files are
> meant to be avoided by expiring files after 365 days. My worst-case
> scenario now goes like this: disaster strikes, the backup is required --
> and some files have been purged on all nodes because they have been sitting
> around too long without being touched. I recall that there was some method
> to "touch" these files, but I can't remember or find it in the docs now.
> Can someone give any advice on how this is done in general, or possibly in
> particular with duplicity?
>

tahoe deep-check --repair --add-lease

If you're using the default alias, that's it.  If you use a different
alias, you have to specify the alias name.  If you're not using aliases,
specify the URI.

Put this command in a cron job, to run every couple of weeks (in case we
someday reduce the lease expiration time).
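
If you would rather have cron call a small script than the bare command,
something like this Python sketch would do.  The function names are
hypothetical, and the "backup:" alias used below is only an example; the
command and flags are the ones given above.

```python
import subprocess


def lease_renewal_command(target=None):
    """Build the argv for the deep-check command shown above.

    target is an alias (for example "backup:", an assumed name) or a
    directory URI; None means rely on the default alias.
    """
    cmd = ["tahoe", "deep-check", "--repair", "--add-lease"]
    if target is not None:
        cmd.append(target)
    return cmd


def renew_leases(target=None):
    """Run the command and return its exit status (needs tahoe on PATH)."""
    return subprocess.run(lease_renewal_command(target)).returncode
```

Calling renew_leases() from the scheduled script covers the default alias;
pass the alias name or URI for anything else.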

-- 
Shawn.