[tahoe-lafs-trac-stream] [tahoe-lafs] #2123: Build intermittently-connected replication-only storage grid

tahoe-lafs trac at tahoe-lafs.org
Thu Dec 12 15:22:03 UTC 2013


#2123: Build intermittently-connected replication-only storage grid
-----------------------------+------------------------
     Reporter:  amontero     |      Owner:  daira
         Type:  enhancement  |     Status:  new
     Priority:  normal       |  Milestone:  undecided
    Component:  unknown      |    Version:  1.10.0
   Resolution:               |   Keywords:  sneakernet
Launchpad Bug:               |
-----------------------------+------------------------

New description:

 I'm trying to achieve the following Tahoe-LAFS scenario:

 === Assumptions ===
 * I'm the only grid administrator. The introducer, clients and storage
 nodes are all administered by me. However, not all of them are in
 privacy-safe locations, so Tahoe-LAFS provides granular access to allowed
 files and keeps the remaining files private. Nice!
 * I'm the only grid uploader; no other users will be able to upload. I'll
 give them read-only caps. Think of it as a friendnet with a gatekeeper.
 * Nodes are isolated/offline from each other most of the time, either
 because there is no network connectivity between them or because a
 local-network Tahoe-in-a-box device is powered off.
 * Storage nodes can see/connect to each other only at certain times, for
 limited periods (rendezvous).

 === Requirements ===
 * All storage nodes should hold all the shares in the grid, in order to
 provide the desired reliability and offline-from-the-grid access to all
 files.
 * No wasted space. Here, "wasted space" is defined as "shares in excess of
 those necessary to read the file locally" (>k). We only want to hold
 enough shares to keep a full local replica of the grid readable, and no
 more.
 * To increase reliability/redundancy, we add more full-grid-replica nodes
 and repair. But each storage node should hold the entire grid on its own,
 to be able to read from it offline. No node knows, or needs to know, how
 many other storage nodes exist; it just gets in contact with one of them
 from time to time.

 === Proposed setup ===
 * Configure a grid with k=1, h=1, N=2 (a configuration sketch follows
 below).
 * Create a cron or manual job to be run when nodes rendezvous. This job
 will be a deep-repair, to ensure that nodes holding new shares replicate
 them to nodes that do not hold them yet.
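
 A minimal sketch of the client-side encoding parameters this setup needs.
 The option names below are the standard {{{[client]}}} encoding settings
 in tahoe.cfg; the comments describe how they map onto this scenario:

 {{{
 # tahoe.cfg on the uploading client (sketch for this scenario)
 [client]
 shares.needed = 1   # k: any single share is enough to read a file
 shares.happy = 1    # h: an upload succeeds with a single reachable server
 shares.total = 2    # N: generate one extra share for the next node we meet
 }}}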

 With this setup, the process would be as follows (example commands are
 sketched below):
 1. A "tahoe backup" is run against a locally reachable storage node that
 is disconnected from the grid. h=1 makes uploads "always happy" (they
 succeed with a single server). k=1 is simply the smallest value; no
 striping is desired. This step backs up files into the grid by placing
 one share on the local storage node. Backup done.
 2. Later, another node comes online/becomes reachable. Either via a cron
 job or a manual run, it is now time for the grid to achieve redundancy.
 There is no connectivity scheduling: we don't know when we'll see that
 node again. We run a deep-repair operation from any node. With N=2 and
 only one share held by the most up-to-date backup node, the arriving node
 would receive another share for each file it didn't previously know
 about. Replication done.
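
 For illustration, the two steps could look roughly like the following
 commands (the {{{backups:}}} alias and the paths are just placeholders
 for this example):

 {{{
 # Step 1: back up into the (currently isolated) local storage node
 tahoe backup ~/data backups:archive

 # Step 2: once another storage node is reachable, replicate via repair
 tahoe deep-check --repair --add-lease backups:
 }}}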

 === Current problem ===
 Share placement, as it works today, cannot guarantee that a share is not
 wrongly placed on a node that already holds one. With k=1/N=2, if a
 cron-triggered repair is run while the node is isolated, we would be
 wasting space, since the local node already holds enough shares to
 retrieve the whole file.

 Worse still: a single node holding both of the N shares would prevent
 arriving nodes from getting their replicas (their own local share), since
 the repairer would be satisfied with both shares being present in the
 grid, even on the same node. This could lead to shares never achieving
 any replication outside of the creator node, creating a SPOF for the
 data.

 === Proposed solution ===
 Add a server-side configuration option to storage nodes that makes them
 gently reject holding shares in excess of k (a hypothetical configuration
 sketch follows below). This would address the space wasting. Also, since
 the local storage node would refuse to store an extra/unneeded share, a
 newly arriving storage node would get the remaining share at repair time
 to fulfill the desired N, thus achieving/increasing replication.
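
 As a rough illustration of what such an option could look like (the
 {{{max_shares_per_file}}} name is purely hypothetical; no such option
 exists in tahoe.cfg today, it is exactly what this ticket proposes to
 add):

 {{{
 # tahoe.cfg on each storage node (HYPOTHETICAL proposed option)
 [storage]
 enabled = true
 # Gently refuse to accept more shares of any single file than this limit,
 # leaving the excess shares to be placed on other (replica) nodes.
 max_shares_per_file = 1
 }}}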

 Current/future placement improvements can't be relied on to achieve this
 easily and, since this looks more like a [storage] server-side policy,
 that route seems unlikely. At least, that is as far as I'm currently able
 to understand share placement, or how this could even be achieved with a
 minimal/sufficient guarantee (sometimes this gets quantum-physics-like to
 me). I think it's too much to rely on upload client behavior/decisions,
 since clients have a very limited knowledge window of the whole grid,
 IMO.

 Apart from the described use case, this setting would be useful in other
 scenarios where the storage node operator needs to exercise some control
 for other reasons.

 I've already discussed this scenario with Daira and Warner to ensure that
 the described solution would work for it. As per Zooko's suggestion, I've
 done this writeup to allow some discussion before jumping into coding in
 my own branch as the next step. That's in a separate ticket (#2124), just
 to keep feature specs and implementation separate from this single use
 case, since I think other scenarios might come up that could benefit from
 implementing the proposed solution.

 I've also collected implementation details while discussing this with
 Daira and Warner, ~~but I'll leave that for the followup ticket~~ which
 can also be found at #2124.

 Anyone else interested in this scenario? Suggestions/improvements?

 For more info, this ticket is a subset of #1657. See also related issues:
 #793 and #1107.

--

Comment (by amontero):

 Linking to related issues #793, #1107 and #1657 in the issue summary.

-- 
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2123#comment:6>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage

