Opened at 2013-11-30T20:13:42Z
Last modified at 2013-12-12T18:19:40Z
#2123 new enhancement
Build intermittently-connected replication-only storage grid — at Version 2
Reported by: | amontero | Owned by: | daira |
---|---|---|---|
Priority: | normal | Milestone: | undecided |
Component: | unknown | Version: | 1.10.0 |
Keywords: | sneakernet space-efficiency | Cc: | |
Launchpad Bug: |
Description (last modified by amontero)
I'm trying to achieve the following Tahoe-LAFS scenario:
Assumptions
- I'm the only grid administrator. Introducer, clients and storage nodes are all administered by me. However, not all of them are in privacy-safe locations, so Tahoe-LAFS provides granular access to the allowed files and privacy for the remaining ones. Nice!
- I'm the only grid uploader; no other users will be able to upload. I'll give them read-only caps. Think of it as a friendnet with a gatekeeper.
- Nodes are isolated/offline from each other most of the time. This can be because there is no internetworking connectivity or because a local-net Tahoe-in-a-box device is powered off.
- Storage nodes can see/connect to each other only at certain times, for limited periods (rendezvous).
Requirements
- All storage nodes should hold all the shares in the grid, in order to provide the desired reliability and offline-from-the-grid access to all files.
- No wasted space. Here, "wasted space" is defined as "shares in excess of those necessary to read the file locally" (>k). We want each node to hold only enough shares to have a full local replica of the grid readable, not any more.
- To increase reliability/redundancy, we add more full-grid-replica nodes and repair. But each storage node should hold the entire grid on its own, to be able to read from it while offline. No node knows or needs to know how many other storage nodes exist; it just gets in contact with one of them from time to time.
Proposed setup
- Configure a grid with k=1, h=1, N=2 (see the tahoe.cfg sketch below).
- Create a cron or manual job to be run when nodes rendezvous. This job will run a deep repair to ensure that nodes holding new shares replicate them to nodes not yet holding them.
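For reference, a minimal sketch of the client-side encoding settings in tahoe.cfg (these option names exist in Tahoe-LAFS 1.10; the values are the ones this scenario calls for):

    [client]
    # k: any file can be read from a single share, i.e. every node holds a full replica
    shares.needed = 1
    # h: an upload to a single reachable storage node counts as successful
    shares.happy = 1
    # N: aim for one extra share so a second node can hold its own replica
    shares.total = 2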
With this setup, the process would be as follows:
- A "tahoe backup" is run against a locally reachable, disconnected from grid storage node. h=1 achieves "always happy" successful uploads. k=1 just is the simplest value, no stripping is desired. This step backups files into the grid by placing one share in the local storage node. Backup done.
- Later, another node comes online/becomes reachable. Either via cron job or a manual run, it's now time for the grid to achieve redundancy: we run a deep-repair operation from any node. With N=2 and only one share of each file, held by the most up-to-date backup node, the arriving node receives another share of each file it didn't previously know about. A command sketch follows below.
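A sketch of the two steps as commands, assuming a "tahoe:" alias already points at the backup root (the alias and paths are illustrative; "tahoe backup" and "tahoe deep-check --repair" are existing CLI commands):

    # step 1: back up into the grid while only the local node is reachable
    tahoe backup ~/Documents tahoe:Documents

    # step 2: at rendezvous, from cron or by hand, repair everything under the
    # root so the newly reachable node receives the shares it is missing
    tahoe deep-check --repair --add-lease tahoe: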
Current problem
Share placement, as it works today, cannot guarantee that no share is wrongly placed on a node that already holds one. With k=1/N=2, if a cron-triggered repair runs while a node is isolated, we waste space, since the local node already holds enough shares to retrieve the whole file.
Worse still: a single node holding both of the N shares would prevent arriving nodes from getting their replicas (their own local share), since the repairer would be satisfied with both shares being present in the grid, even on the same node. This could lead to shares never being replicated outside of the creator node, creating a SPOF for the data.
Proposed solution
Add a server-side configuration option to storage nodes that makes them gently reject shares in excess of k. This would address the space wasting. Also, since the local storage node would refuse to store an extra/unneeded share, a newly arriving storage node would get the remaining share at repair time, fulfilling the desired N and thus achieving/increasing replication.
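To make the idea concrete, here is a hypothetical sketch of what such an option could look like in a storage server's tahoe.cfg. The option name "max_shares_per_file" is made up for illustration; no such option exists today:

    [storage]
    # HYPOTHETICAL, not implemented: politely refuse to accept more than
    # this many shares of any single file on this storage server (set to k)
    max_shares_per_file = 1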
Current or future placement improvements can't be relied on to achieve this easily, and since this looks more like a [storage] server-side policy, addressing it client-side seems unlikely. At least, that's as far as I'm currently able to understand share placement or how this could be achieved with minimal/sufficient guarantees (sometimes it gets quantum-physics-like to me). IMO, it's too much to rely on upload clients' behavior/decisions, since they have only a very limited window of knowledge of the whole grid.
Apart from the described use case, this setting would be useful in other scenarios where the storage node operator needs to exercise this kind of control for other reasons.
I've already discussed this scenario with Daira and Warner to ensure that the described solution would work for it. As per Zooko's suggestion, I've done this writeup to allow some discussion before jumping into coding in my own branch as the next step. That will happen in a separate ticket, to keep the feature spec and implementation separate from this single use case, since other scenarios might come up that could benefit from the proposed solution.
I also collected implementation details while discussing this with Daira and Warner, but I'll leave those for the followup ticket.
Anyone else interested in this scenario? Suggestions/improvements?
For more info: this ticket is a subset of #1657.
Change History (2)
comment:1 Changed at 2013-11-30T20:35:16Z by amontero
- Description modified (diff)
comment:2 Changed at 2013-11-30T20:48:51Z by amontero
- Description modified (diff)