Opened at 2009-12-19T22:41:02Z
Last modified at 2014-09-24T04:53:50Z
#864 new enhancement
Automated migration of shares between storage servers
Reported by: | kpreid | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | undecided |
Component: | code-storage | Version: | 1.5.0 |
Keywords: | preservation accounting leases anti-censorship | Cc: | kpreid, amontero@… |
Launchpad Bug: |
Description (last modified by Lcstyle)
There should be a way to have a storage server automatically move some or all of its shares to other storage servers. Right now this can be done only as a tedious manual process.
Use cases:
- A storage server is being decommissioned; the shares would otherwise be entirely lost.
- A storage server needs some disk space freed up for other purposes.
If the server finds that another storage server already holds the same share (which can happen after a repair), it could simply delete its local copy instead of moving it.
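The duplicate check above could be sketched roughly as follows. This is only an illustration: the `ShareId` tuple and the `remote` inventory mapping are hypothetical names, not part of any existing storage-server API, and a real implementation would presumably want to verify the remote copy before deleting anything.

```python
from typing import Dict, Set, Tuple

# Hypothetical share identifier: (storage index, share number).
ShareId = Tuple[str, int]


def locally_deletable(local: Set[ShareId],
                      remote: Dict[str, Set[ShareId]]) -> Set[ShareId]:
    """Return the local shares that some other server reports holding.

    `remote` maps a server name to the shares it claims to hold
    (an assumed inventory; how it is gathered is out of scope here).
    """
    held_elsewhere: Set[ShareId] = set().union(*remote.values())
    return local & held_elsewhere
```

Shares not returned by this function still need to be moved before the server can free the space.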
Refinement:
Furthermore, since the storage server knows the server selection algorithm, it can send each share to a storage server that occurs early in that share's permuted peer order. The average time to retrieve the shares would then be better than if all of this server's shares were moved uniformly to an arbitrarily selected server.
In fact, if this cleverness is implemented it might be worth doing under normal conditions, to redistribute shares uploaded to the "wrong" peers because the right (early-searched) ones were unreachable at the time. (The interaction of this with #573 would need consideration.)
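To make the "permuted peer order" idea concrete, here is a minimal sketch. It assumes the order is obtained by hashing the storage index together with each server id and sorting by the digest; the hash function, the concatenation order, and the `early` cutoff are assumptions for illustration, not a statement of what the real peer-selection code does.

```python
import hashlib
from typing import List


def permuted_order(server_ids: List[bytes], storage_index: bytes) -> List[bytes]:
    """Rank servers for one storage index.

    Assumption: rank by SHA-256(storage_index + server_id). Any
    deterministic per-file permutation would serve for this sketch.
    """
    return sorted(server_ids,
                  key=lambda sid: hashlib.sha256(storage_index + sid).digest())


def misplaced(my_id: bytes, server_ids: List[bytes],
              storage_index: bytes, early: int = 3) -> bool:
    """True if this server is not among the first `early` servers for the file.

    `early` is a hypothetical tuning knob for what counts as "early-searched".
    """
    return my_id not in permuted_order(server_ids, storage_index)[:early]
```

A server could run `misplaced` over its own shares to pick candidates for redistribution under normal conditions.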
Change History (7)
comment:1 Changed at 2009-12-20T20:30:33Z by davidsarah
- Keywords preservation added
comment:2 Changed at 2009-12-22T05:52:49Z by warner
comment:3 Changed at 2009-12-22T19:03:11Z by davidsarah
Closing #481 (building some share-migration tools) as a duplicate of this. Its description was:
Zandr and I were talking about what sorts of tools we'd like to have available when it comes time to move shares from one disk to another.
The Repairer is of course the first priority, and should be able to handle share loss, but there are some techniques we might use to make things more efficient: using shares that already exist instead of generating new ones.
If we have a large disk full of shares that has some problems (bad blocks, etc), we should be able to dd or scp off the shares to another system. This wants a tool that will try to read a file (skipping it if we get I/O errors), verify as much of it as we can (seeing if the UEB hash matches), then send it over the network to somewhere else.
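A rough sketch of such a salvage tool might look like the following. It is deliberately simplified: the `verify` callback is a hypothetical stand-in for a real UEB hash check, the "send over the network" step is replaced by a copy into a local destination directory, and a file is skipped entirely on the first I/O error rather than partially recovered.

```python
import os
from typing import Callable, List, Tuple


def salvage_shares(src_dir: str, dest_dir: str,
                   verify: Callable[[bytes], bool]) -> Tuple[List[str], List[str]]:
    """Copy readable, verifiable share files from src_dir to dest_dir.

    Returns (copied, skipped): `copied` holds paths relative to src_dir,
    `skipped` holds full source paths that were unreadable or failed `verify`.
    """
    copied: List[str] = []
    skipped: List[str] = []
    os.makedirs(dest_dir, exist_ok=True)
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                with open(path, "rb") as f:
                    data = f.read()
            except OSError:
                # Bad block or similar: skip this file and keep going.
                skipped.append(path)
                continue
            if not verify(data):
                skipped.append(path)
                continue
            rel = os.path.relpath(path, src_dir)
            target = os.path.join(dest_dir, rel)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with open(target, "wb") as out:
                out.write(data)
            copied.append(rel)
    return copied, skipped
```

The skipped list gives the Repairer a starting point for the shares that could not be rescued directly.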
If a disk is starting to fail (we've seen SMART statistics, or we're starting to see hash failures in the shares we return, etc), then we might want to kick the disk into "abandon ship" mode: get all shares off the disk (and onto better ones) as quickly as possible. The server could do the peer-selection work and ask around and find the "right" server for each share (i.e. the first one in the permuted order that doesn't already have a share), or it could just fling them to a "lifeboat" node and leave the peer-selection work until later.
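The two "abandon ship" strategies could be combined along these lines. This is a sketch only: it assumes the permuted order for the share has already been computed, and `holders`, `self_id`, and `lifeboat` are hypothetical inputs rather than existing interfaces.

```python
from typing import Optional, Sequence, Set


def choose_target(permuted: Sequence[str], holders: Set[str],
                  self_id: str, lifeboat: Optional[str] = None) -> Optional[str]:
    """Pick a destination for one share being evacuated from `self_id`.

    Prefers the first server in permuted order that does not already
    hold a share of this file; otherwise falls back to the lifeboat
    node, or None if no lifeboat was configured.
    """
    for server in permuted:
        if server != self_id and server not in holders:
            return server
    return lifeboat
```

For example, with permuted order `["A", "B", "C"]`, server `B` evacuating, and `A` already holding a share, the share would go to `C`.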
Repair nodes should have a directory where we can dump shares that came from other servers: the repair node should treat that directory as a work queue, and it should find a home for each one (or discard it as a duplicate). The repair node needs to be careful to not treat abandon-ship nodes as suitable targets, so we can avoid putting shares back on the server that was trying to get rid of them.
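The work-queue behaviour described above might be sketched as below. Everything here is assumed for illustration: shares are identified by file name, `order_for` is a hypothetical function returning the preferred server order for a share, `holdings` is an in-memory stand-in for asking servers what they hold, and the actual network transfer is omitted (placement is only recorded in `holdings`).

```python
import os
from typing import Callable, Dict, Sequence, Set


def drain_queue(queue_dir: str,
                order_for: Callable[[str], Sequence[str]],
                holdings: Dict[str, Set[str]],
                abandoning: Set[str]) -> Dict[str, str]:
    """Process every share file in queue_dir once.

    Each share ends up as "duplicate" (a non-abandoning server already
    has it), "deferred" (no eligible server; the file stays queued), or
    mapped to the name of the server chosen as its new home.
    """
    results: Dict[str, str] = {}
    for name in sorted(os.listdir(queue_dir)):
        path = os.path.join(queue_dir, name)
        # Never place shares back on servers that are trying to shed them.
        eligible = [s for s in order_for(name) if s not in abandoning]
        if any(name in holdings.get(s, set()) for s in eligible):
            os.remove(path)
            results[name] = "duplicate"
            continue
        if not eligible:
            results[name] = "deferred"
            continue
        target = eligible[0]
        holdings.setdefault(target, set()).add(name)  # stands in for the upload
        os.remove(path)
        results[name] = target
    return results
```

Copies held only by abandon-ship servers are ignored by the duplicate check, since those copies are expected to disappear.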
It might also be useful to split up a storage server, or to take a functional server and export half of its shares in a kind of rebalancing step.
comment:4 Changed at 2009-12-22T19:29:42Z by davidsarah
- Keywords accounting leases added
comment:5 Changed at 2010-12-16T01:22:50Z by davidsarah
- Keywords anti-censorship added
comment:6 Changed at 2012-03-04T19:01:52Z by amontero
- Cc amontero@… added
comment:7 Changed at 2014-09-24T04:44:23Z by Lcstyle
- Description modified
This is closely related to #699 (and the other tickets that one references). Each storage server should know about the full server list, and peer-selection is a function of the server list and the storage-index, so all servers should be able to independently come up with the "right places" for a given share. Each could then act of its own volition to move the share into the right place (making life easier for the eventual downloader).
Ideally, a really lazy uploader who dumped all N shares on a single server should find that, eventually, their shares were redistributed to the "right" places.
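The lazy-uploader case can be illustrated with a small planning function. It assumes, purely for the sake of the sketch, that the "right" place for the i-th share is the i-th server in the file's permuted order (wrapping around if there are more shares than servers); the real placement rule may differ.

```python
from typing import Dict, Iterable, Sequence


def plan_redistribution(share_numbers: Iterable[int],
                        permuted: Sequence[str],
                        holder: str) -> Dict[int, str]:
    """Plan moves for shares that were all dumped on `holder`.

    Returns {share number: destination server}. Shares whose assumed
    destination is `holder` itself are left out, since they stay put.
    """
    moves: Dict[int, str] = {}
    for i, shnum in enumerate(sorted(share_numbers)):
        dest = permuted[i % len(permuted)]
        if dest != holder:
            moves[shnum] = dest
    return moves
```

With shares 0, 1, 2 all sitting on server `A` and a permuted order of `["B", "A", "C"]`, the plan moves share 0 to `B` and share 2 to `C`, and share 1 stays on `A`.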
There are a few wrinkles:
But, in general, I like the idea, and it'd be nice if servers were involved in an automatic maintenance process.