[tahoe-dev] Tahoe-LAFS v1.8 planning / Administrivia / Big Picture

Zooko O'Whielacronx zooko at zooko.com
Wed Aug 4 02:23:52 UTC 2010


On Tue, Aug 3, 2010 at 12:22 PM, Francois Deppierraz
<francois at ctrlaltdel.ch> wrote:
>
> I'd really love to see location/rack/server-awareness in the peer-selection
> process.

Thank you for the feedback! I think a lot of people strongly want this
feature. This is described on [wiki:ServerSelection].

What's the next step? I still don't know exactly what the UX would be
for this feature. Would you have a flat file containing a list of
serverids followed by categories like this:

# serverid category:id, category:id, category:id, ...
alt6cjddwfnwrnct4lx2ypwricrgtoam colo:us-west-1a, rack:5, chassis:3
cufg4m4c7bfujnf5tkhjdazicn7ifkae colo:us-west-1b, rack:1, chassis:1
e5itfysbe3qeqgzflxdnm6ypraufj6vj colo:singapore-1, rack:1, chassis:1
fp3xjndgjt2npubdl2jqqb26clanyag7 colo:singapore-1, rack:1, chassis:2

and would the server selection algorithm automatically use the
following as its highest-priority requirement: "spread the shares as
evenly as possible among the different values of each category"? And
if there were more than one category, would it treat each successive
one as the next-highest priority once the previous priorities were
satisfied?
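
To make the question concrete, here is a rough sketch of how such a file
could be parsed and used. Nothing like this exists in Tahoe-LAFS today: the
function names parse_server_tags and place_shares are made up, and the
greedy placement is just one possible reading of "spread evenly, category
by category". It also assumes that a rack id is scoped to its colo and a
chassis id to its rack, which the file format above does not actually say.

```python
from collections import Counter

# The sample file from above (a hypothetical format, not an existing one).
SAMPLE = """\
# serverid category:id, category:id, category:id, ...
alt6cjddwfnwrnct4lx2ypwricrgtoam colo:us-west-1a, rack:5, chassis:3
cufg4m4c7bfujnf5tkhjdazicn7ifkae colo:us-west-1b, rack:1, chassis:1
e5itfysbe3qeqgzflxdnm6ypraufj6vj colo:singapore-1, rack:1, chassis:1
fp3xjndgjt2npubdl2jqqb26clanyag7 colo:singapore-1, rack:1, chassis:2
"""


def parse_server_tags(text):
    """Return {serverid: [(category, id), ...]}, keeping category order."""
    servers = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        serverid, _, rest = line.partition(" ")
        tags = []
        for item in rest.split(","):
            category, _, value = item.strip().partition(":")
            tags.append((category, value))
        servers[serverid] = tags
    return servers


def place_shares(servers, num_shares):
    """Greedily place shares, one at a time.

    Each share goes to the server whose first category currently holds the
    fewest shares; ties are broken by the second category, then the third,
    and finally by the number of shares already on that server.
    """
    per_server = Counter()  # shares placed on each server
    per_group = Counter()   # shares placed under each tag prefix

    def cost(serverid):
        tags = servers[serverid]
        # Assumption: ids nest, so "rack:1" in two colos are different racks.
        prefixes = [tuple(tags[: i + 1]) for i in range(len(tags))]
        return tuple(per_group[p] for p in prefixes) + (per_server[serverid],)

    placement = []
    for _ in range(num_shares):
        best = min(sorted(servers), key=cost)
        placement.append(best)
        per_server[best] += 1
        tags = servers[best]
        for i in range(len(tags)):
            per_group[tuple(tags[: i + 1])] += 1
    return placement


servers = parse_server_tags(SAMPLE)
placement = place_shares(servers, 4)
```

With the four sample servers and four shares, this puts one share on each
server. Whether a greedy pass like this is good enough, or whether the
priorities should be hard constraints instead, is exactly the open question.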

> One of the current deployment I did is a grid of 3 servers, each with 24
> SATA disks. One Tahoe-LAFS storage node per disk and each server is located
> in a different datacenter.

Sweet! How is it working? Could you give us some problem reports,
success reports, benchmarks? :-)

> With the default 3-of-10 encoding on such setup, I have currently no way to
> ensure that two servers can fail without any impact on file availability.

Until we fix ServerSelection to do what you want, you could try to
accomplish it by changing your parameters. You have 72 storage
servers. If you set M=72 and K=22 then you'll have approximately the
same redundancy as M=10 and K=3. Then a normal upload would put one
share on each of the servers, and you could lose any two of those
24-drive servers and still keep the file.
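
The arithmetic behind that can be checked in a few lines. This is only a
back-of-the-envelope sketch: the helper names are made up, and it assumes
the upload really did place exactly one share on each of the 72 storage
nodes.

```python
import math

NODES_PER_MACHINE = 24   # one storage node per SATA disk
MACHINES = 3             # one machine per datacenter

M = NODES_PER_MACHINE * MACHINES   # total shares, one per storage node
K = math.ceil(M * 3 / 10)          # keep roughly the 3-of-10 ratio


def shares_left(total_shares, nodes_per_machine, machines_lost):
    """Shares that survive when whole machines disappear (one share per node)."""
    return total_shares - nodes_per_machine * machines_lost


def recoverable(total_shares, k, nodes_per_machine, machines_lost):
    """True if at least k shares survive the loss."""
    return shares_left(total_shares, nodes_per_machine, machines_lost) >= k


print("M =", M, "K =", K)
print("shares left after losing two machines:",
      shares_left(M, NODES_PER_MACHINE, 2))
print("file still recoverable:", recoverable(M, K, NODES_PER_MACHINE, 2))
```

Losing two machines removes 48 shares and leaves 24, which is still at
least K=22.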

Now what should you set Servers Of Happiness, H, to? If you set H to
70 then uploads will abort if any fewer than 70 servers are available
at that time. If 70 separate servers have the file then even if two of
your 24-node machines are gone you'll still have 22 storage servers
left. :-)
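
Here is the same worst-case reasoning as a small sketch. The function
names are hypothetical, and it assumes each of the H servers holds one
distinct share and that the two lost machines take as many of those
servers with them as they possibly can.

```python
def min_happy(k, nodes_per_machine, machines_lost):
    """Smallest H that still leaves k shares after the worst-case loss."""
    return k + nodes_per_machine * machines_lost


def worst_case_shares_left(h, nodes_per_machine, machines_lost):
    """Shares left if every node on the lost machines held a share."""
    return max(h - nodes_per_machine * machines_lost, 0)


K = 22
NODES_PER_MACHINE = 24

print("minimum H:", min_happy(K, NODES_PER_MACHINE, 2))
print("worst case with H=70:", worst_case_shares_left(70, NODES_PER_MACHINE, 2))
print("worst case with H=69:", worst_case_shares_left(69, NODES_PER_MACHINE, 2))
```

So H=70 is the smallest value that still guarantees K=22 shares after two
machines are lost; with H=69 the worst case leaves only 21.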

I've never seen real production use of K and M values that large.
Everyone always uses the default M=10, K=3. I did actually set K=15
and M=30-something for a while on a test grid that had 15 live
servers. It worked okay. If you try setting something like M=72, K=22,
H=70 then please do run measurements of performance and do please
report your results to this list! :-)
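
If you want to try it, the following sketch renders the settings as a
config stanza. It assumes the encoding parameters live in the [client]
section of the node's tahoe.cfg under the names shares.needed,
shares.happy and shares.total (where shares.total is what I have been
calling M); please check that against the docs for your version before
relying on it.

```python
import configparser
import io

# K, H and M from the discussion above.
params = {
    "shares.needed": "22",  # K
    "shares.happy": "70",   # H
    "shares.total": "72",   # M
}

config = configparser.ConfigParser()
config["client"] = params

# Render the stanza as text so it can be pasted into tahoe.cfg by hand.
buffer = io.StringIO()
config.write(buffer)
stanza = buffer.getvalue()
print(stanza)
```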

Regards,

Zooko

http://tahoe-lafs.org/trac/tahoe-lafs/wiki/ServerSelection

