rootcap-recovery authority-graph tool

Tue Dec 6 21:37:49 UTC 2016

In today's devchat, I described a tool I want to build to manage
rootcaps. In particular, I'm thinking about how to express the
conjunction of distinct factors (2FA/3FA/MFA/etc) that must be acquired
to get back to a tahoe directory (rootcap).

For example, if you're using tahoe as a password manager, you might want
access to depend upon:

* being on your home computer, OR your laptop
* AND having a specific Yubikey present
* OR knowing a secret estate-planning key that stays in a safety-deposit
  box or with a lawyer so your heirs can get to it

In boolean terms, that looks like:

 lawyer-key OR (yubikey AND (home-key OR laptop-key))
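As a rough illustration of how such an expression could be held in the
tool, here is a minimal sketch that stores the map as nested tuples and
checks whether a given set of factors satisfies it. The tuple layout,
the "KOFN" tag, and the names POLICY and satisfied() are assumptions
made for this example, not an existing tahoe format or API.

```python
# Hypothetical in-memory form of the recovery map: leaves are factor
# names, interior nodes are ("AND", ...), ("OR", ...) or ("KOFN", k, ...).
POLICY = ("OR", "lawyer-key",
          ("AND", "yubikey", ("OR", "home-key", "laptop-key")))


def satisfied(node, have):
    """Return True if the factors in `have` satisfy `node`."""
    if isinstance(node, str):
        return node in have
    op, *children = node
    if op == "KOFN":
        # the first element after the tag is the threshold k
        k, *children = children
    results = [satisfied(child, have) for child in children]
    if op == "AND":
        return all(results)
    if op == "OR":
        return any(results)
    if op == "KOFN":
        return sum(results) >= k
    raise ValueError("unknown operator: %r" % (op,))
```

For example, satisfied(POLICY, {"yubikey", "laptop-key"}) holds, while
the Yubikey alone or the two machine keys without the Yubikey do not.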

I'd like this tool to let you draw a graph of linked nodes that make up
this "recovery map". The source nodes would be things like:

* FILE: a file, stored on a particular computer in a particular place.
  This limits access to that one computer (or someone who steals the
  file from there). The token might also be stored on a removable
  drive, which must be inserted to proceed.
* PASSWORD: a string that gets typed in, either memorized by a human
  brain, or copied from a sealed envelope. Maybe those should be
  separate types, with different character sets or training processes.
* GPG/smartcard/Yubikey: the factor is a random token, which is then
  encrypted to a GPG public key, and the ciphertext is stored in the
  serialized map file. Later, you'll need the Yubikey present to decrypt
  and retrieve the token.
* REMOTE: some third-party service that enforces its own login criteria,
  like a Google account (or Authy, or Keybase, or something). For
  Tahoe's purposes, these services must hold a secret for you, and only
  reveal it to clients which fulfill their login requirements (so bare
  OAuth or U2F isn't really sufficient).

The graph would have intermediate nodes with boolean operators:

* AND
* OR
* k-of-N (Shamir secret splitting)

And the output nodes would be authorities:

* writecap
* readcap
* other credentials

After you draw and drag and wire up all the nodes the way you want them,
you hit the button that says "Compile", and the tool would:

* generate new output-node authorities, or prompt you to paste them in
* generate the necessary shards (for Shamir secret splitting), or
  combined keys for the AND nodes
* present a UI to print or transfer or download the source factors to
  their new homes
* after all factors are safely delivered, wipe everything from memory
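One way the "Compile" step could turn AND and OR nodes into key
material is sketched below. The AND node hashes its input keys together;
the OR node is handled by wrapping the same secret once per alternative,
which is an assumption of this sketch rather than a settled design. The
wrap()/unwrap() helpers are a toy built from the standard library
(SHA-256 pad plus HMAC tag, secrets of at most 32 bytes, one secret per
key); a real tool would use a vetted authenticated-encryption library.
All names here are hypothetical.

```python
import hashlib
import hmac
import secrets


def and_key(child_keys):
    """Key for an AND node: hash of all child keys, each length-prefixed."""
    h = hashlib.sha256()
    for key in child_keys:
        h.update(len(key).to_bytes(4, "big"))
        h.update(key)
    return h.digest()


def _pad(key):
    return hashlib.sha256(b"pad:" + key).digest()


def wrap(key, secret):
    """Toy wrap: XOR with a key-derived pad, then append an HMAC tag."""
    if len(secret) > 32:
        raise ValueError("toy wrap handles at most 32 bytes")
    body = bytes(s ^ p for s, p in zip(secret, _pad(key)))
    return body + hmac.new(key, body, hashlib.sha256).digest()


def unwrap(key, blob):
    """Check the tag, then undo the XOR pad."""
    body, tag = blob[:-32], blob[-32:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or corrupted blob")
    return bytes(c ^ p for c, p in zip(body, _pad(key)))


# Compile: lawyer-key OR (yubikey AND (home-key OR laptop-key))
home, laptop, yubikey, lawyer = (secrets.token_bytes(32) for _ in range(4))
output_secret = secrets.token_bytes(32)  # stands in for the rootcap's key
inner = secrets.token_bytes(32)          # key of the inner OR node
recovery_map = {
    "inner-via-home": wrap(home, inner),
    "inner-via-laptop": wrap(laptop, inner),
    "root-via-lawyer": wrap(lawyer, output_secret),
    "root-via-and": wrap(and_key([yubikey, inner]), output_secret),
}

# Recovery with the Yubikey factor on the laptop:
got_inner = unwrap(laptop, recovery_map["inner-via-laptop"])
recovered = unwrap(and_key([yubikey, got_inner]), recovery_map["root-via-and"])
```

Only the wrapped blobs go into the serialized map; the factor keys
themselves are delivered to their homes and then forgotten.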

The transfer tool might use magic-wormhole to help deliver a
machine-specific key ("FILE") to the target machine, without leaving
temporary copies on USB drives or other places. The Yubikey node would
need some UI to help select the right GPG public key and make sure it is
usable.

Later, when you run "tahoe recover-rootcap", the tool would read the
recovery map, show you the flowchart, figure out which factors are
available locally, and prompt you to provide the missing pieces (type in
a password, insert a Yubikey, etc). Then it would derive the rootcap and
use it to mount/access the tahoe-side filesystem.
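The "figure out which factors are available and prompt for the rest"
step can be sketched as a walk over the map that returns the alternative
sets of factors still missing. This is a minimal sketch for AND/OR nodes
only; the nested-tuple map and the names POLICY, missing_sets() and
prompts() are assumptions for illustration, not part of any existing
"tahoe recover-rootcap" command.

```python
from itertools import product

# Hypothetical recovery map, same shape as the boolean example earlier.
POLICY = ("OR", "lawyer-key",
          ("AND", "yubikey", ("OR", "home-key", "laptop-key")))


def missing_sets(node, have):
    """Return the minimal alternative sets of factors still needed."""
    if isinstance(node, str):
        return [frozenset()] if node in have else [frozenset([node])]
    op, *children = node
    options = [missing_sets(child, have) for child in children]
    if op == "OR":
        # any one child's alternative is enough
        combined = [s for opt in options for s in opt]
    elif op == "AND":
        # need one alternative from every child
        combined = [frozenset().union(*combo) for combo in product(*options)]
    else:
        raise ValueError("unknown operator: %r" % (op,))
    # drop alternatives that are strict supersets of another alternative
    minimal = {s for s in combined if not any(t < s for t in combined)}
    return sorted(minimal, key=lambda s: (len(s), sorted(s)))


def prompts(have):
    """Alternatives to show the user, each as a sorted list of names."""
    return [sorted(s) for s in missing_sets(POLICY, have)]
```

On the laptop with nothing else present, prompts({"laptop-key"}) offers
two choices: supply the lawyer key, or insert the Yubikey. An empty
inner list means the map is already satisfied.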

(I'm declaring a scope boundary at the point where this tool emits the
rootcap, but obviously any given application needs something on top of
that to access the files and then safely forget the rootcap afterwards.
E.g. a password manager application might read the passwords just long
enough to help the user select the right one, paste it into the target
browser, then forget the rootcap and exit).

The recovery map might need to be copied to each machine that should be
able to recover the rootcap (in addition to machine-specific factors).
The map itself doesn't grant rootcap access, but it does reveal some
private information (and gives attackers specific instructions on which
machines they need to steal or compromise), so you probably don't want
to make it public. It'd be nice to somehow express this "necessary but
not sufficient" property in the graph: maybe an input node named
"recovery map" (that is a necessary input for the Yubikey nodes), and an
additional output node named "privacy" which the map is linked to.

Ideally, you should be able to configure this to get full recovery from
just a Yubikey, or just a password (if that's what you want). The goal
is to be able to take a blank laptop, a Debian install CD, and your
previously-chosen portable factors, and get back to a full Tahoe node
with rootcap. This is pretty easy if your factors can store a kilobyte
of data (for the tahoe node config, and the ciphertext of GPG-encrypted
factors). But if they can't, we need some other place to put that data.

I'm wondering if tahoe-lafs.org could host it: we could run a small
service that accepts a single short signed+encrypted blob per GPG key,
retrievable with a signed request (maybe the retrieval request should
include a transfer encryption key, or maybe we just rely upon the
original encryption). Something in the tahoe-recover-rootcap code would
get the GPG public key from the Yubikey, query the service, decrypt the
ciphertext, unpack the recovery map, then incorporate the source factor
into whatever else the map requires.

Does anyone know if you can load a Yubikey with a GPG key, then move to
a different computer, then find out what the protected key's public
fingerprint is? (If smartcards hold entire public keys, we could
probably stuff something into a comment field).

This looks like a job for the One True Grid, of course, if we can
squeeze a readcap into the portable factor. And if the One True Grid
existed :-).

Another idea we had was a smartphone-based 2FA application as a source
factor. When you compile the map, this node displays a QR code (or
magic-wormhole code) that the smartphone app can read, and pairs a key
over to it. During recovery, to satisfy this node, the app must be
opened and a new QR code scanned (which binds the reveal of the key to
the specific session being opened), which then pairs the key back to the
host. The protocol needs to guard against an MitM snooping your
recovery-time QR code (e.g. the host should remember some secondary key,
and the app should encrypt the factor with that host key, in addition to
the PAKE-based QR code).

One other feature idea was to have e.g. 2-of-4 factors for readcap
access, but 3-of-4 for writecap. This is easy if the writecap requires
all N shards (e.g. 2-of-4 readcap, 4-of-4 writecap: just make a separate
AND node with all the shards, which means hash all the shards together
and wrap the writecap with the resulting key). But letting any k-sized
subset retrieve it needs more work. There might be some
clever Shamir trick to do this directly, but meejah suggested just
doubling the size of the shards: you make a batch of 4 "A" shards (using
2-of-4), and a batch of 4 "B" shards (using 3-of-4), you wrap the
readcap with the A key and the writecap with the B key, then you build
the first output "shard" by concatenating A[0]+B[0], the second with
A[1]+B[1], etc.
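The doubled-shard idea can be sketched with a small textbook Shamir
implementation over a prime field. This is an illustration under
assumptions: the field (the Mersenne prime 2**521 - 1), the integer
encoding of keys, and the names split() and combine() are choices made
for this example, and each output shard is shown as a pair rather than a
byte concatenation. Wrapping the readcap and writecap under the A and B
keys is left out.

```python
import secrets

P = 2**521 - 1  # Mersenne prime, large enough to hold a 32-byte key


def split(secret, k, n):
    """Split `secret` (an int below P) into n shares; any k recover it."""
    coeffs = [secret] + [secrets.randbelow(P) for _ in range(k - 1)]

    def f(x):
        # evaluate the random polynomial at x by Horner's rule
        y = 0
        for c in reversed(coeffs):
            y = (y * x + c) % P
        return y

    return [(x, f(x)) for x in range(1, n + 1)]


def combine(shares):
    """Lagrange-interpolate the polynomial at x=0 from (x, y) shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret


a_key = secrets.randbelow(2**256)  # would wrap the readcap
b_key = secrets.randbelow(2**256)  # would wrap the writecap
A = split(a_key, 2, 4)
B = split(b_key, 3, 4)
shards = list(zip(A, B))           # shard i carries A[i] and B[i]

# Any two shards yield the A key; any three yield the B key as well.
read_key = combine([a for a, _ in shards[:2]])
write_key = combine([b for _, b in shards[1:4]])
```

With only two shards, interpolating the B halves gives an unrelated
value, so the writecap stays protected until a third shard shows up.
The modular inverse via pow(den, -1, P) needs Python 3.8 or later.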

And, this wouldn't need to be limited to tahoe rootcaps: you could wrap
other credentials or keys this way. But at the output end, you want to
use the wrapped key for some specific purpose, and you don't want to
write it to disk or display it to a terminal (where it might get
logged), because then the disk or the terminal log turns into a giant
backdoor "OR" node that you weren't intending. So it's probably worth
making this tool be slightly application-specific, to protect the final
secret properly.

thoughts?
 -Brian