[tahoe-lafs-trac-stream] [tahoe-lafs] #793: using removable disk as a storage backend

tahoe-lafs trac at tahoe-lafs.org
Thu Dec 12 15:09:05 UTC 2013


#793: using removable disk as a storage backend
------------------------------+---------------------------------------------
      Reporter:  warner       |      Owner:
          Type:  enhancement  |     Status:  new
      Priority:  major        |  Milestone:  undecided
     Component:  code-storage |    Version:  1.5.0
    Resolution:               |   Keywords:  bandwidth performance
 Launchpad Bug:               |  migration placement preservation storage
                              |  removable backend sneakernet
------------------------------+---------------------------------------------
Changes (by amontero):

 * keywords:
     bandwidth performance migration placement preservation storage
     removable backend
     =>
     bandwidth performance migration placement preservation storage
     removable backend sneakernet


New description:

 For years, on my Linux box, I've used an encrypted USB flash drive for my
 GPG keys, SSH keys, future winning lottery numbers that I've written down
 after a session at my secret time portal, etc. It's a two- or three-factor
 scheme: you need to possess the drive (which stays in my pocket), and get
 into my computer, and sometimes into my head too.

 But I'm not very confident in the Linux encrypted-disk schemes, and they
 generally don't provide much integrity checking. And one of these USB
 drives has started to fail. On the other hand, I *am* confident in Tahoe's
 encryption and integrity checking, and, hey, waitaminute...

 So what I'm thinking is that some aspects of #467 (specifically the
 creation of alternate backends) would enable a "local storage"
 configuration: shares are stored in a locally-designated directory
 instead of out on the network. Each "server" could correspond to a
 different removable drive. I'd probably put k+1 shares on each drive, use
 two or three drives, and keep at least one of them in a safe offline
 place. Having k+1 per drive might let me tolerate a bad sector without
 having to go to the safe-deposit box. When a drive starts failing, I
 mount a new one in the same place and hit "Repair". Mutable updates would
 only change the shares on the drive in my pocket, until the next time I
 fetch the safe-deposit-box drive and do a Repair (at which point those
 shares will be updated to the latest versions).

 The amount of data is usually quite small compared to the size of the
 drive. I probably have about 10kB of data to keep safe, and a 4GB thumb
 drive to store it on. So this can afford to use k=1 and a high expansion
 ratio.
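
 A minimal sketch of what those encoding parameters could look like in
 tahoe.cfg (the [client] keys are real Tahoe settings; the values, which
 assume k=1 with two shares on each of three drives for N=6 total, are
 illustrative and untested, and today's peer selection doesn't know how
 to group shares per drive, which is part of what this ticket would add):

     [client]
     # k: any single surviving share is enough to recover the data
     shares.needed = 1
     # consider an upload healthy once shares land on at least two
     # distinct "servers" (here, two mounted drives)
     shares.happy = 2
     # N: 6 shares total, intended as 2 per drive across 3 drives
     shares.total = 6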

 Some wrinkles to figure out:

  * it would be a nuisance to actually mount all the drives at the same
    time. It might be useful to configure a "staging area" on a temporary
    ramdisk (since the whole exercise is to maintain the two-factor
    requirement: removable drive with the shares, plus the rootcap). Then
    tahoe would encode to the staging area, and you could copy shares to
    the USB drive later. Or maybe mount two drives at a time, tell the
    Repairer to only create certain shares (instead of also creating the
    shares for the missing drive and putting them in the wrong place), and
    run the Repairer multiple times.
  * the storage backend would store the share contents directly to disk,
    instead of wrapping them in the usual "container" format (since we
    don't need leases or write-enablers)
  * the backend code would need to correctly interpret the lack of a
    readable path (i.e. the removable drive being removed) as the "server"
    being offline, and look for a different one
  * since many systems will mount removable media at a fixed location, it
    might be useful to define the "server id" by writing a special file to
    the removable drive (sort of like the regular disk UUID). When the
    tahoe backend code looks to see if a "server" is "online", it looks
    for this serverid file to decide which server is actually available
    (see the first sketch after this list)
  * and, of course, this would be easier to use with a good FUSE frontend.
    Most of the time I use a hand-built "keycache" program (which copies
    the data into a ramdisk and erases it after a timeout), for which a
    simple "tahoe get" would be sufficient (see the second sketch after
    this list). But sometimes I want programs to read out that data
    directly, for which I'd need FUSE.
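
 As a rough illustration of the offline-detection and serverid points
 above, here's a minimal Python sketch of how a backend might probe for
 those serverid files; the marker filename, mount points, and function
 name are hypothetical, not existing Tahoe APIs:

     import os

     def scan_for_servers(mount_points, marker=".tahoe-serverid"):
         """Return {serverid: path} for each removable drive found."""
         available = {}
         for path in mount_points:
             try:
                 # hypothetical marker file holding this drive's server id
                 with open(os.path.join(path, marker)) as f:
                     serverid = f.read().strip()
             except OSError:
                 # unreadable path == drive removed == "server" offline,
                 # so skip it and look for a different one
                 continue
             available[serverid] = path
         return available

     print(scan_for_servers(["/media/usb0", "/media/usb1"]))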
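
 And a sketch of the "keycache" flow from the last point, built on the
 real "tahoe get" command; the Tahoe path, cache location, and timeout
 are placeholders:

     import os
     import subprocess
     import time

     SOURCE = "tahoe:keys/secret"        # placeholder path in the grid
     CACHE = "/dev/shm/keycache/secret"  # ramdisk-backed cache file
     TIMEOUT = 300                       # seconds until erasure

     os.makedirs(os.path.dirname(CACHE), exist_ok=True)
     # fetch the secret out of the grid into the ramdisk
     subprocess.check_call(["tahoe", "get", SOURCE, CACHE])
     os.chmod(CACHE, 0o600)              # owner-only while cached
     try:
         time.sleep(TIMEOUT)             # serve reads until the timeout
     finally:
         os.remove(CACHE)                # then erase the cached copy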

--

-- 
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/793#comment:3>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage

