#1107 new enhancement

"sneakernet" servers

Reported by: warner
Owned by:
Priority: major
Milestone: undecided
Component: code-storage
Version: 1.7.0
Keywords: bandwidth performance migration preservation storage backend sneakernet
Cc: amontero@…
Launchpad Bug:

Description (last modified by amontero)

"Never underestimate the bandwidth of a station wagon filled with 9-track tapes."

Zandr and I were cooking up a high-volume low-bandwidth backup scheme the other day, to manage our large digital-photo libraries (after a shoot, we'll typically add 4-8GB of image files to the library, and update 10-100MB of metadata DB files). A lot of these changes are append-only. Uploading this much data over just a DSL line can take days or weeks. Using only the network is convenient but somewhat painful.

We were sketching out a Git-based scheme, since all file-synchronization problems are really version-control problems, and because Git has tools to create "packfiles" which contain a compressed form of all the data needed to get from version A to version B. The idea was to put these packfiles on portable drives, carry them from one machine to another, and let a process on the receiving machine incorporate each packfile into the second copy of the archive.
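
As a rough illustration of that Git-based scheme (not part of this ticket's proposal), "git bundle" is one existing way to capture everything needed to get from revision A to revision B in a single file that can ride on a drive. The repository paths and ref names below are made up:

    # Sketch of the "packfile on a portable drive" idea, using "git bundle"
    # to pack everything reachable from new_ref but not from old_rev.
    import subprocess

    def make_delta(repo, old_rev, new_ref, bundle_path):
        # At home: write the delta onto the removable drive.
        subprocess.run(
            ["git", "-C", repo, "bundle", "create", bundle_path,
             f"{old_rev}..{new_ref}"],
            check=True)

    def apply_delta(repo, bundle_path, ref="master"):
        # At the office: fetch directly from the bundle file on the drive.
        subprocess.run(
            ["git", "-C", repo, "fetch", bundle_path,
             f"{ref}:refs/remotes/sneakernet/{ref}"],
            check=True)

    # make_delta("/home/me/photos.git", "last-synced", "master", "/media/usb/delta.bundle")
    # apply_delta("/srv/photos.git", "/media/usb/delta.bundle")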

But it'd be nice if you could use Tahoe for this. The simplest use case would be a backup grid that has just one server node, and k=N, so you're uploading all the shares to the same place. Imagine that the server is at your office or some other place you visit every day, and that the two machines have network connectivity, but the link is slow relative to the amount of data you want to back up.
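
For what it's worth, the single-server, no-expansion case above might look something like this in the client's tahoe.cfg, using the standard shares.* encoding options (the values are just the k=N=1 illustration, not a recommendation):

    [client]
    # One storage server and no erasure-coding expansion: every share
    # the client produces is destined for the same place.
    shares.needed = 1
    shares.happy = 1
    shares.total = 1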

Then you'd configure the client with a local directory that gets associated with the remote server. That local directory would actually live on a removable drive. When the client creates a share to send to that server, it just writes it to the drive. A separate process slowly uploads shares from the drive to the server, using whatever bandwidth is available, and removes each one from the drive as it finishes; if you just wait long enough, you end up with the same share distribution as without this change.
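
A minimal sketch of that slow uploader, assuming a spool directory on the drive and a hypothetical upload_share_to_server() helper (not an existing Tahoe API):

    # Trickle shares from the drive's spool directory to the storage
    # server, deleting each one only after it has been delivered.
    import os, time

    SPOOL = "/media/backup-drive/spool"   # hypothetical mount point

    def upload_share_to_server(path):
        raise NotImplementedError("placeholder for the real share upload")

    def drain_spool():
        while True:
            shares = sorted(os.listdir(SPOOL)) if os.path.isdir(SPOOL) else []
            if not shares:
                time.sleep(60)        # drive unplugged, or nothing to do
                continue
            for name in shares:
                path = os.path.join(SPOOL, name)
                try:
                    upload_share_to_server(path)
                except Exception:
                    break             # server unreachable: retry later
                os.remove(path)       # delivered, free the space on the drive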

But, when you leave for work in the morning, you unplug the drive and bring it along with you. When you arrive at the office, you plug the drive into the server machine, which notices it and starts copying shares off, deleting them as it goes. At the end of the day, you unplug the drive and bring it home, to repeat the process. A cheap 8GB flash drive used this way will achieve an average throughput of 740kbps, which is better upstream bandwidth than most high-end DSL lines, and a cheap 100GB external HD swapped daily provides 10Mbps.
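
Those figures are just the drive's capacity moved once per 24-hour commute cycle, expressed as an average bit rate:

    # Back-of-the-envelope check of the quoted throughput numbers.
    def sneakernet_bps(capacity_bytes, cycle_seconds=24 * 3600):
        return capacity_bytes * 8 / cycle_seconds

    print(sneakernet_bps(8e9) / 1e3)    # 8 GB flash drive   -> ~740 kbps
    print(sneakernet_bps(100e9) / 1e6)  # 100 GB external HD -> ~9.3 Mbps (the "10 Mbps" above, roughly)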

The drive behaves like a "mail bag" that always moves back and forth, carrying as much data as can fit, or sometimes being empty when there's no work to be done. It's a layer-4 protocol, using humans for transport, and large removable drives as packets, with transmission control managed by the computers at either end.

The client would want some tools to store shares locally, if a backup occurred while the removable drive was elsewhere, and then to detect the drive coming back and move the shares onto it. It might be a good idea to hold a copy of the shares locally until the remote server has confirmed receipt (via a set of DYHB "do you have block?" queries over the network), and to tolerate drive failures by re-sending the shares once a failure is detected.
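
A sketch of that confirm-then-delete bookkeeping, with query_dyhb() standing in for the real DYHB check (the paths and helpers here are hypothetical, not existing Tahoe APIs):

    # Keep a local safety copy of each share until the server confirms
    # it holds the block; re-spool anything that never made it.
    import os, shutil

    LOCAL_HOLD = "/var/lib/tahoe/held-shares"
    DRIVE_SPOOL = "/media/backup-drive/spool"

    def query_dyhb(storage_index, shnum):
        raise NotImplementedError("placeholder for the real DYHB query")

    def reconcile(held):
        """held: list of (storage_index, shnum, filename) kept locally."""
        for storage_index, shnum, name in held:
            if query_dyhb(storage_index, shnum):
                # Server has it: the local safety copy can go.
                os.remove(os.path.join(LOCAL_HOLD, name))
            elif os.path.isdir(DRIVE_SPOOL):
                # Not confirmed (e.g. the drive died in transit): re-send.
                # A real version would track in-flight shares to avoid
                # re-copying ones that are simply still on their way.
                shutil.copy(os.path.join(LOCAL_HOLD, name),
                            os.path.join(DRIVE_SPOOL, name))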

It might also be nice to take advantage of multiple removable drives, to allow bidirectional exchanges, or to run a rotating-ring system among your coworkers' houses (a sneakernet-augmented friendnet: each morning you drop a USB stick on Bob's desk, Alice drops one off on yours, and with 5 participants, all data would get to everyone else's machines within a week).

For extra credit, put LEDs next to the USB port to tell you when it's safe/useful to transfer the drive. Some USB flash drives have an e-ink display to show how full they are: hack the display to show a label that tells you where the drive needs to go next.

The configuration for this should make it easy to keep sending small shares over the network, and only use the sneakernet carriers for large files: this probably means measuring/estimating the upload bandwidth and showing the user the upload queue, measured in units of time, and sorting the small shares to the front of the list. It would also mean changing the "file has been uploaded" completion semantics, since until the shares have finished migrating to their remote homes, the file would remain vulnerable to a failure of the local host.
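
One possible shape for that routing policy, assuming a measured upstream rate and an arbitrary one-hour cutoff (all names and numbers below are illustrative):

    # Small shares go out over the network, smallest first so the queue
    # visibly drains; anything that would tie up the link for too long
    # is diverted to the sneakernet spool on the drive.
    import shutil

    MEASURED_UPSTREAM_BPS = 50_000      # bytes/sec, estimated from recent uploads
    NETWORK_CUTOFF_SECONDS = 3600       # arbitrary one-hour cutoff

    def route_pending(pending, drive_spool):
        """pending: list of (path, size_bytes) shares awaiting upload."""
        network_queue, eta = [], 0.0
        for path, size in sorted(pending, key=lambda p: p[1]):  # small first
            seconds = size / MEASURED_UPSTREAM_BPS
            if seconds <= NETWORK_CUTOFF_SECONDS:
                eta += seconds
                network_queue.append((path, eta))   # ETA shown to the user in time units
            else:
                shutil.copy(path, drive_spool)      # too big: wait for the mail bag
        return network_queue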

Change History (5)

comment:1 Changed at 2010-07-06T19:37:52Z by davidsarah

Possibly merge with ticket #793.

comment:2 Changed at 2011-08-27T01:38:07Z by davidsarah

  • Keywords bandwidth performance migration preservation storage backend added

comment:3 Changed at 2013-12-12T15:11:01Z by amontero

  • Description modified (diff)

Tagging. Linking to related #1657 and #2123.

comment:4 Changed at 2013-12-12T15:14:00Z by amontero

  • Keywords sneakernet added

comment:5 Changed at 2013-12-14T20:33:48Z by amontero

  • Cc amontero@… added