[tahoe-lafs-trac-stream] [Tahoe-LAFS] #3022: Servers of happiness share placement distributes storage load unevenly in small grids
Tahoe-LAFS
trac at tahoe-lafs.org
Fri May 3 23:40:29 UTC 2019
#3022: Servers of happiness share placement distributes storage load unevenly in
small grids
-------------------------+------------------------------------------
Reporter: exarkun | Owner:
Type: defect | Status: new
Priority: normal | Milestone: undecided
Component: unknown | Version: 1.12.1
Resolution: | Keywords: servers-of-happiness, upload
Launchpad Bug: |
-------------------------+------------------------------------------
Comment (by ccx):
There are several related questions I have about the desired behavior:
First: Is it desirable to upload more than one share to a single server,
and if so, under which conditions?
Second: If so, what will the mechanism be for relocating some of the
shares away from servers that hold more than one once more servers become
available, so that resilience is maximized? Preferably in a way that
doesn't waste space by keeping too many useless copies of certain shares.
Third: What will the mechanism be for optimizing space usage by removing
redundant copies of each share? Taking into account that some servers
might want to hold on to extra copies as a cache for quick access, as
currently expressed by peers.preferred.
From reading the source code, I understand the current behavior to be:
* On upload, shares are distributed across servers using the SoH
algorithm.
* If there are any shares left without servers assigned, the uploader
just round-robins through the server list, regardless of how many shares
each server already holds.
* On check-and-repair, shares are checked for readability.
* Whether all of the shares are reachable is reported as the boolean
"healthy".
* Happiness is calculated and returned, but it doesn't seem to affect
anything. (A rough sketch of the happiness calculation and the
round-robin fallback follows this list.)
* If the file is not healthy, repair is run. Repair uses the SoH
algorithm to maximize happiness and round-robins the rest.
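To make sure we're talking about the same thing, here is a minimal
sketch (my own paraphrase, not the Tahoe-LAFS implementation) of those
two behaviors: happiness computed as the size of a maximum bipartite
matching between shares and servers, plus a round-robin pass that hands
leftover shares to servers regardless of how many they already hold. The
plain-dict data structures are simplifications for illustration.
{{{#!python
from itertools import cycle


def happiness(candidates):
    """Size of a maximum matching; candidates maps share number -> set
    of server ids that hold or could accept that share."""
    matched = {}  # server id -> share number currently matched to it

    def try_assign(share, seen):
        for server in candidates[share]:
            if server in seen:
                continue
            seen.add(server)
            # Take a free server, or displace its current share if that
            # share can be re-matched elsewhere (augmenting path).
            if server not in matched or try_assign(matched[server], seen):
                matched[server] = share
                return True
        return False

    return sum(try_assign(share, set()) for share in candidates)


def round_robin_leftovers(leftover_shares, servers, placements):
    """Hand out shares that got no server by cycling through the server
    list, ignoring existing load."""
    server_cycle = cycle(servers)
    for share in leftover_shares:
        placements.setdefault(next(server_cycle), set()).add(share)
    return placements


if __name__ == "__main__":
    # 4 shares, 3 reachable servers: happiness tops out at 3, and the
    # leftover share doubles up on one of them.
    candidates = {share: {"A", "B", "C"} for share in range(4)}
    print(happiness(candidates))  # -> 3
    print(round_robin_leftovers([3], ["A", "B", "C"],
                                {"A": {0}, "B": {1}, "C": {2}}))
}}}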
To me it seems that, for files where N is close to the size of the grid,
this algorithm will result in a large amount of duplication unless every
server has flawless connectivity, with no real means of reclaiming the
space once the grid recovers from a partition.
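To make that concrete, here is a back-of-the-envelope simulation (my own
illustration, with made-up server names, not Tahoe-LAFS code) of a
10-server grid holding an N=10 file: three servers drop out, repair
round-robins their shares onto servers that already hold one, and after
the partition heals nothing reclaims the extra copies.
{{{#!python
from collections import Counter
from itertools import cycle

all_servers = ["s%d" % i for i in range(10)]
copies = Counter({s: 1 for s in all_servers})  # initial upload: one share each

partitioned = set(all_servers[7:])             # s7, s8, s9 become unreachable
missing_shares = 3                             # their shares look lost

# Repair round-robins the "missing" shares over the reachable servers,
# each of which already holds a share.
reachable = [s for s in all_servers if s not in partitioned]
server_cycle = cycle(reachable)
for _ in range(missing_shares):
    copies[next(server_cycle)] += 1

# The partition heals: the original copies on s7..s9 are still there,
# and nothing deletes the duplicates created by the repair.
print(sum(copies.values()))                    # -> 13 copies of a 10-share file
print(max(copies.values()))                    # -> some servers now hold 2
}}}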
Moreover, uploading more than max(k, SoH) shares seems pointless to me if
it results in more than one share per server. Frankly, I'm not really
sure whether the practice of uploading more than one share to a server
matches the usage model for Tahoe-LAFS at all.
Thinking about how to redistribute shares, I've come to the conclusion
that it's probably best if the node initiating the upload/check-and-repair
is also in charge of the reallocation. If there was a way for a node to
ask for content to be moved to it, it would open (or rather widen) the
possibility for a malicious actor with introducer access to cause
DoS/data loss by creating a lot of nodes, hoping to get to the top of the
permuted list, and then throwing the data away. It would also mean that
all nodes need to agree on the algorithm for the permuted list and on
which nodes are trusted to hold the data, rather than just the uploader
deciding.
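For context, the permuted list orders the servers per file by hashing
each server's id together with the storage index. The sketch below uses
an assumed SHA-256 construction purely for illustration (the real
Tahoe-LAFS derivation differs in detail); the point is that whoever can
register many node ids via the introducer can fish for positions near
the top of that ordering.
{{{#!python
# Simplified sketch of a per-file permuted server list. The SHA-256
# construction below is an assumption for illustration, not the exact
# hash Tahoe-LAFS uses.
from hashlib import sha256


def permuted_servers(storage_index, server_ids):
    """Order servers deterministically for a given storage index."""
    return sorted(server_ids,
                  key=lambda sid: sha256(sid + storage_index).digest())


if __name__ == "__main__":
    servers = [b"server-%d" % i for i in range(5)]
    print(permuted_servers(b"storage-index-1", servers))
    print(permuted_servers(b"storage-index-2", servers))  # different order
}}}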
I've been thinking about possible improvements to the garbage-collection
mechanism (mainly in terms of quick space reclamation), and incorporating
an explicit mapping of shares onto specific servers through an updateable
data structure might be the way to address the issue of data relocation
too.
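As a strawman for what such an updateable structure could look like,
here is a sketch of an uploader-maintained placement record. The
ShareMap name, its fields, and the reclamation query are all
hypothetical; nothing like this exists in Tahoe-LAFS today.
{{{#!python
# Hypothetical uploader-maintained record of intended share placement;
# every name and field here is an assumption for the sake of discussion.
from dataclasses import dataclass, field


@dataclass
class ShareMap:
    storage_index: bytes
    version: int = 0                    # bumped on every change
    placements: dict = field(default_factory=dict)  # share number -> server id

    def assign(self, share_num, server_id):
        """Declare that share_num should now live on server_id."""
        self.placements[share_num] = server_id
        self.version += 1

    def reclaimable(self, actual):
        """Given the actually observed holders (share number -> set of
        server ids), return the copies the map no longer points at,
        i.e. candidates for garbage collection."""
        return {share: holders - {self.placements.get(share)}
                for share, holders in actual.items()
                if holders - {self.placements.get(share)}}


if __name__ == "__main__":
    smap = ShareMap(storage_index=b"si-1",
                    placements={0: "A", 1: "B", 2: "A"})
    smap.assign(2, "C")                 # relocate share 2 off server A
    print(smap.reclaimable({0: {"A"}, 1: {"B"}, 2: {"A", "C"}}))
    # -> {2: {'A'}}  (the stale copy on A can be reclaimed)
}}}
With something along these lines, the uploader (the only party deciding
placement, per the previous paragraph) could tell servers which copies
are authoritative and let garbage collection reclaim the rest quickly.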
--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/3022#comment:2>
Tahoe-LAFS <https://Tahoe-LAFS.org>
secure decentralized storage