[tahoe-lafs-trac-stream] [tahoe-lafs] #543: 'rebalancing manager'
tahoe-lafs
trac at tahoe-lafs.org
Wed Jul 3 09:39:00 UTC 2013
#543: 'rebalancing manager'
------------------------------+--------------------------------
Reporter: warner | Owner:
Type: enhancement | Status: new
Priority: major | Milestone: soon
Component: code-storage | Version: 1.2.0
Resolution: | Keywords: performance repair
Launchpad Bug: |
------------------------------+--------------------------------
New description:
So, in doing a bunch of manual GC work over the last week, I'm starting to
think about what a "rebalancing manager" service would look like.
The basic idea is that storage servers would give a central service access
to
some special facet, through which the manager could enumerate the shares
present on each one. The manager would slowly cycle through the entire
storage-index space (over the course of a month, I imagine), probably one
prefixdir at a time.

It would ask all the storage servers about which shares they hold, and
figure out which other servers also hold those shares (this query is an
online version of the 'tahoe debug catalog-shares' CLI tool). Then it
would make decisions about which shares ought to go where. There are two
goals (sometimes competing). The first is to move shares closer to the
start of the permuted peer-selection order, so that clients don't have
to search as far to find them. The second is to smooth out disk usage
among all servers (more by percentage than by absolute usage).
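The first goal can be made concrete with a small sketch. Assuming a permutation in the spirit of Tahoe's (each server ranked by a hash over the storage index and the server id; the real logic lives in the client's peer-selection code, and the names below are illustrative), a "misplacement" score tells the manager how deep a client must search the permuted list before it has seen every server holding a share:

```python
import hashlib

def permuted_order(server_ids, storage_index):
    # Rank servers for one storage index, in the spirit of Tahoe's
    # permuted peer-selection order (hash details are an assumption).
    return sorted(server_ids,
                  key=lambda sid: hashlib.sha256(storage_index + sid).digest())

def misplacement(server_ids, storage_index, holders):
    # How far down the permuted list a client must search before it has
    # found every server currently holding a share (lower is better).
    order = permuted_order(server_ids, storage_index)
    return max(order.index(h) for h in holders) + 1
```

The rebalancer would favor moves that lower this score without pushing any destination server's percentage-full above its peers.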

Once the manager works out the minimum-effort rearrangement, it will
inform the two servers that they should move a share between them. The
servers can then use a direct connection to copy the share to its new
home and then delete the original. In grids without full bidirectional
connectivity, the manager could conceivably act as a relay.
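One move step might look like the following sketch. The read_share / write_share / delete_share methods are hypothetical names for the special facet described above (no such API exists yet); the important property is copy-verify-then-delete, so a failed transfer never destroys the only copy:

```python
def move_share(src, dst, storage_index, shnum):
    # One rebalancing step: copy the share to its new home over a direct
    # connection, verify the copy, and only then delete the original.
    # src/dst are hypothetical server proxies; the facet API is unspecified.
    data = src.read_share(storage_index, shnum)
    dst.write_share(storage_index, shnum, data)
    if dst.read_share(storage_index, shnum) == data:
        src.delete_share(storage_index, shnum)
        return True
    return False  # leave the original in place if verification fails
```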

When a new (empty) disk is added to the grid, it will accumulate shares
very slowly, and only get shares for new files (those which are created
after the new node comes online). A rebalancing manager would make
better use of the new disk: filling it with old shares too, thus freeing
space on old servers so they can continue to participate in the grid
(instead of being read-only).

There may be ways to perform this task without a central manager. For
example, we could treat balancing as an aspect of repair, such that the
repair process ought to include moving shares around to better places.
In this approach, the client that performs a repair would also do
rebalancing. It is not clear whether clients ought to have the same
level of authority as a trusted repair-manager: for example, should
clients have the ability to delete shares of immutable files? Making the
clients drive the rebalancing process would ensure that no effort is
expended on unwanted files. On the other hand, 1) clients must then take
an active interest in rebalancing, and 2) the load generated by
rebalancing would be pretty choppy (a central manager could do it
smoothly, over time, whereas a client would want to finish its
repair/rebalancing pass as quickly as possible).

This will also interact with accounting. A privileged rebalancing
manager could be given the authority to clone a share (account labels
and all) to a new server, whereas a client performing rebalancing itself
would naturally be restricted to whatever storage that client was
normally allowed to consume. I'm not sure whether this issue is
significant or not.

On the implementation side, I'd expect the rebalancing manager to be a
Tahoe node (made with 'tahoe create rebalancer', or the like), which
advertises itself via the introducer. Storage servers would have a
configuration setting that says "give rebalancing authority to any
rebalancer that is advertised with a signature from blesser key X". This
would require each storage server to be configured with pubkey X, but
would not require any changes on the rebalancer node when new storage
servers are added.
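As a sketch of what that setting might look like in tahoe.cfg (the key name is invented here; no such option exists today):

```ini
[storage]
# Hypothetical: accept rebalancing authority from any rebalancer whose
# introducer announcement carries a signature from this blesser key.
rebalancer.blesser_pubkey = <pubkey X>
```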

It might also be a good idea to tell the rebalancer how many storage
servers it should expect to see, so it can refrain from doing anything
unless it's fully connected.

I'm also thinking that the enumerate-your-shares interface could be used
to generate estimates of how many files are in the grid. The rebalancer
(or some other node with similar enumeration authority, perhaps a
stats-gatherer or disk-watcher) could query for all shares in the aa-ab
prefix range, merge the responses from all servers, then multiply by the
number of prefixes. If the servers could efficiently distinguish mutable
shares from immutable shares, we could get estimates of both filetypes.
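The estimation step is simple enough to sketch. Merging the per-server listings into a set counts each file once even when its shares live on several servers; the default of 1024 prefixes assumes two-character base32 prefixdirs (32**2), which is an assumption about the on-disk layout:

```python
def estimate_total_files(shares_by_server, num_prefixes=1024):
    # shares_by_server: one iterable of storage indexes per server, all
    # drawn from the same sampled prefix. Merge (dedup across servers),
    # then scale by the number of prefixes.
    distinct = set()
    for listing in shares_by_server:
        distinct.update(listing)
    return len(distinct) * num_prefixes
```

With a way to tell mutable from immutable shares, running this once per share type would give both estimates.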
--
Comment (by daira):
#483 (repairer service) was closed as a duplicate.
--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/543#comment:11>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage
More information about the tahoe-lafs-trac-stream
mailing list