[tahoe-dev] Automatic rebalancing
Greg Troxel
gdt at ir.bbn.com
Sun Dec 5 15:16:51 UTC 2010
Ravi Pinjala <ravi at p-static.net> writes:
> As far as description languages for data allocation go, Ceph has
> already solved this problem - check out the "CRUSH" algorithm.
> Basically, it's a description language for data placement that
> controls replication and data placement, and I think it also lets
> clients figure out which servers a piece of data is on without
> querying them first. IIRC, the code for it is in a separate library
> from the rest of Ceph, so it might be feasible to just put a thin
> python wrapper around it and use it.
That's very interesting - I have been thinking that as one sets up
multiple nodes, controlling data placement is important to get the
intended redundancy against physical loss. But once you start
thinking about server-controlled rebalancing and migration, it becomes
necessary to express the placement rules programmatically so that
others can evaluate them, not just have them run on the client.
I would hope that we could come up with one schema that would satisfy
the needs of 95% of the grids. One obvious concern, arguably the
primary one, is physical loss/reliability correlation (what Ceph seems
to be thinking about). Another is policy; one might have data of a type
that is not permissible to store in some places (e.g., ITAR, or the EU
Data Protection Directive,
http://en.wikipedia.org/wiki/Data_Protection_Directive). So far these
two are orthogonal, and perhaps there are more.
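
As a rough illustration of the idea (this is not CRUSH, and not anything
that exists in Tahoe-LAFS), a placement rule can be kept as plain data and
evaluated by a deterministic function, so that a client, a server, or a
rebalancer all compute the same answer from the same inputs. The sketch
below assumes a hypothetical server list tagged with a "site" (failure
domain) and a "region" (jurisdiction), and a hypothetical rule with
"shares", "max_per_site", and "allowed_regions" fields. It ranks servers
by a hash of the storage index and server name, which is a simple
stand-in for a real placement algorithm.

```python
import hashlib

# Hypothetical server descriptions; the field names are assumptions.
SERVERS = [
    {"name": "s1", "site": "boston", "region": "us"},
    {"name": "s2", "site": "boston", "region": "us"},
    {"name": "s3", "site": "denver", "region": "us"},
    {"name": "s4", "site": "seattle", "region": "us"},
    {"name": "s5", "site": "london", "region": "eu"},
    {"name": "s6", "site": "austin", "region": "us"},
]

# Hypothetical rule, expressed as data so anyone can evaluate it.
RULE = {"shares": 3, "max_per_site": 1, "allowed_regions": ["us"]}


def score(storage_index, server_name):
    """Deterministic per-(file, server) rank; lower sorts first."""
    key = f"{storage_index}:{server_name}".encode()
    return hashlib.sha256(key).hexdigest()


def place(storage_index, servers, rule):
    """Return server names for one file, honoring both constraints."""
    # Policy constraint: drop servers in regions the data may not enter.
    allowed = [s for s in servers if s["region"] in rule["allowed_regions"]]
    # Deterministic order that depends only on the file and server names.
    ranked = sorted(allowed, key=lambda s: score(storage_index, s["name"]))
    chosen, per_site = [], {}
    for s in ranked:
        # Reliability constraint: cap shares per failure domain.
        if per_site.get(s["site"], 0) >= rule["max_per_site"]:
            continue
        chosen.append(s["name"])
        per_site[s["site"]] = per_site.get(s["site"], 0) + 1
        if len(chosen) == rule["shares"]:
            break
    return chosen


if __name__ == "__main__":
    print(place("example-storage-index", SERVERS, RULE))
```

The two concerns stay separate in the code: the region filter handles
policy, and the per-site cap handles correlated physical loss. If fewer
eligible sites exist than the rule asks for, this sketch simply returns a
shorter list; a real system would need to decide how to report that.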