[tahoe-dev] Questions

Brian Warner warner at lothar.com
Wed May 12 11:33:27 PDT 2010


On 5/12/10 9:44 AM, Jason Wood wrote:
> Hi,

Welcome!

> Suppose I have a file of 100GB and 2 storage nodes each with 75GB
> available, will I be able to store the file or does it have to fit
> within the realms of a single node?

I think tahoe does what you want here. What matters is the size of the
shares that Tahoe generates, not the size of the original file, and you
have control over those shares.

The ability to store the file will depend upon how you set the encoding
parameters: you get to choose the tradeoff between expansion (how much
space gets used) and reliability. The default settings are "3-of-10"
(very conservative), which means the file is encoded into 10 shares, and
any 3 will be sufficient to reconstruct it. That means each share will
be 1/3rd the size of the original file (plus a small overhead, less than
0.5% for large files). For your 100GB file, that means 10 shares, each
about 33GB in size, which would not fit (it could place two shares on
each server, but not all ten, so the upload would return an error).

But you could set the encoding to 2-of-2, which would give you two 50GB
shares, and it would happily put one share on each server. That would
store the file, but it wouldn't give you any redundancy: a failure of
either server would prevent you from recovering the file.

You could also set the encoding to 4-of-6, which would generate six 25GB
shares, and put three on each server. This would still be vulnerable to
either server being down (since neither server has enough shares to give
you the whole file by itself), but would become tolerant to errors in an
individual share (if only one share file were damaged, there are still
five other shares, and we only need four). A lot of disk errors affect
only a single file, so there's some benefit to this even if you're still
vulnerable to a full disk/server failure.
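The share-size arithmetic above can be sketched as a quick back-of-the-envelope
calculator (hypothetical helper names, not part of Tahoe itself):

```python
def share_size(file_size, k):
    """Each share is ~1/k of the file (ignoring the <0.5% overhead)."""
    return file_size / k

def fits(file_size, k, n, server_space, num_servers):
    """Can n shares of a k-of-n encoding be placed across the servers?"""
    per_share = share_size(file_size, k)
    shares_per_server = int(server_space // per_share)
    return shares_per_server * num_servers >= n

# 100GB file, two servers with 75GB free each (all sizes in GB):
print(fits(100, 3, 10, 75, 2))  # False: 33GB shares, only 2 fit per server
print(fits(100, 2, 2, 75, 2))   # True: 50GB shares, one per server
print(fits(100, 4, 6, 75, 2))   # True: 25GB shares, three per server
```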

So, you can set the encoding parameters (in the "tahoe.cfg" file) to
whatever you like, to meet your goals.
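For example, a 4-of-6 configuration might look like this in the [client]
section of tahoe.cfg (shares.happy counts distinct servers; check the
documentation for your release, since its exact semantics have changed
over time):

```ini
[client]
shares.needed = 4
shares.happy = 2
shares.total = 6
```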

> Do I need to shutdown all clients/servers to add a storage node?

Nope. You can add or remove clients or servers anytime you like. The
central "Introducer" is responsible for telling clients and servers
about each other, and it acts as a simple publish-subscribe hub, so
everything is very dynamic. Clients re-evaluate the list of available
servers each time they do an upload.

This is great for long-term servers, but can be a bit surprising in the
short-term: if you've just started your client and upload a file before
it has a chance to connect to all of the servers, your file may be
stored on a small subset of the servers, with less reliability than you
wanted. We're still working on a good way to prevent this while still
retaining the dynamic server discovery properties (probably in the form
of a client-side configuration statement that lists all the servers that
you expect to connect to, so it can refuse to do an upload until it's
connected to at least those). Adding a server to that "required" list
might require a client restart, though the feature could also be
implemented without one.

> Finally, I see I can link files on the cluster (very useful!), does this
> make an actual link or copy the data? Does the target file have to
> reside on the same storage node as the source file? I think I know the
> answer to this but just want to clarify.

It's just a link. From the point of view of the directories, each file
just lives "in the cloud", and is not associated with any particular
storage nodes: each file has a "filecap" string, and directories are
just lists of filecaps.

Each file has shares on a set of storage nodes (a different set for each
file). Directories are just special kinds of files, so directories also
have shares on a set of storage nodes. The storage nodes used for a
directory are unrelated to the ones used for the files therein.

"Copying" an immutable file from one directory to another just creates a
second link to that file. In fact, "uploading a file to a directory"
actually has two steps: first the file is uploaded into the grid and
returns a filecap, second the directory is modified (by adding the new
filecap to its list). So copying from one directory to another just does
the second step (modifies the target directory), and the original file
isn't touched.
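The two steps can be illustrated with a toy model (hypothetical names and
cap format, not Tahoe's real API): the grid maps filecaps to immutable
data, and directories are just name-to-filecap mappings.

```python
import hashlib

grid = {}  # filecap -> immutable data, standing in for the storage grid

def upload(data: bytes) -> str:
    """Step 1: store the data in the grid and return a filecap for it."""
    cap = "URI:model:" + hashlib.sha256(data).hexdigest()[:16]
    grid[cap] = data
    return cap

# Directories are themselves files, modeled here as plain dicts.
home = {}
backup = {}

cap = upload(b"report contents")
home["report.txt"] = cap  # step 2: add the new filecap to a directory

# "Copying" to another directory repeats only step 2: no data moves.
backup["report.txt"] = home["report.txt"]
print(backup["report.txt"] == cap)  # True: same filecap, same shares
```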

Of course, copying a *mutable* file is different, because the copy must
be a new object (changing the copy should not cause the original to
change). In that case, the data itself must be copied. We don't yet
support efficient large mutable files, and Tahoe uses immutable files by
default, so in practice you don't tend to run into this very much.
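The link-versus-copy distinction can be shown with a standalone sketch
(plain Python semantics, not Tahoe's API): a link shares one object, so
writes are visible through it, while a mutable copy must duplicate the
data so that later writes diverge.

```python
import copy

# Model a mutable file as a list we can write to.
original = ["v1 contents"]

linked = original                     # link: same underlying object
duplicated = copy.deepcopy(original)  # mutable copy: data is duplicated

original[0] = "v2 contents"
print(linked[0])      # "v2 contents" -- a link sees the change
print(duplicated[0])  # "v1 contents" -- the copy is independent
```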


Hope that helps! Let us know how it goes!

cheers,
 -Brian
