[tahoe-dev] #466 state-of-the-patch

Sat Feb 12 13:10:27 PST 2011

Here's a comment I just attached to #466 (signed-extensible introducer
announcements). I'm looking for design review and help in finishing up
this project (which has been going on for at least 3 years.. it'd be
awfully nice to wrap it up :-).

Please tell me what you think!
 -Brian

'''State of the patch'''

I'd say this project is about 80% complete. The patch currently attached
for review is split up into three pieces. This note is to explain what
those pieces do, what the overall design is like, and what's left to
design/build.

== Why Do We Want This? ==

The goal of this ticket is to add signatures to our introducer
announcements, and to make them extensible. Current announcements are a
fixed 6-tuple: (FURL, service_name, remote_interface_name, nickname,
version, oldest_supported). The new announcements will be an arbitrary
JSON-serializable dictionary, embedded in a 3-tuple of (ann_json,
signature, pubkey), where the sig/pubkey are optional.

The utility of extensibility is pretty obvious: there are additional
services and features we could enable (or make more efficient) if we
could safely advertise them ahead of time through the introducer. In
general, extensible protocol formats (dicts, not tuples) improve
flexibility and enables change, since it's awfully difficult to make
changes to all nodes simultaneously. Flexible announcements aren't
strictly necessary: we could instead e.g. add new methods to the
RIStorageServer object, and have clients speculatively attempt to invoke
them, and gently tolerate NameErrors, but that seems inelegant, and
requires a roundtrip: by putting slowly-changing things in the
announcements, clients learn about them earlier.

The value of adding signatures is great, but not immediate. Signatures
would bind the contents of an announcement to some "serverid",
preventing other parties (other grid members, or the Introducer itself)
from forging those contents. This would turn the introduction system
into a secure channel from publisher to subscriber, indexed by serverid.

Without signatures, anybody in the grid can publish anything they like,
such as a record with Alice's real storage-server FURL but with a
nickname of "Bob", which would currently replace Alice's real
announcement.

Currently we use Foolscap Tub IDs (i.e. hash of the tub certificate,
which appears in each FURL) as server IDs. The only way to verify
possession of the corresponding secret is to connect to a FURL that uses
this tubid: Foolscap ensures that the object you connect to (and
subsequently send callRemote messages to) is selected by the
secret-holder. This requires an online check, whereas a signed message
could be verified offline.

We currently use serverids for three things:

 * to distinguish between storage servers who wish to be treated
   separately, sending separate shares to each
 * as a stable long-term seed for the permuted peerlist, used to decide
   how to distribute shares of each file
 * to calculate several shared secrets: the mutable-file
   "write-enabler", and the renew/expire lease tokens. These need to be
   different for each server, so that when a client exercises their
   authority on server A, that doesn't enable A to exercise the client's
   corresponding authority on server B
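
A minimal sketch of the last two uses, to show why the serverid has to
stay stable. The hash construction and the tag string are assumptions
made for illustration, not Tahoe's actual derivation, and the serverids
and secrets are made-up values:

```python
import hashlib

def permuted_servers(storage_index, serverids):
    """Order servers for one file by hashing the file's index with each serverid."""
    return sorted(serverids,
                  key=lambda sid: hashlib.sha256(storage_index + sid).digest())

def per_server_secret(tag, client_secret, serverid):
    """Derive a secret that is only meaningful on the server named by serverid."""
    return hashlib.sha256(tag + b":" + client_secret + serverid).digest()

server_a = b"A" * 20   # hypothetical binary serverids
server_b = b"B" * 20
master = b"hypothetical client lease secret"

renew_a = per_server_secret(b"renew", master, server_a)
renew_b = per_server_secret(b"renew", master, server_b)
order = permuted_servers(b"storage-index-1", [server_a, server_b])
```

Because the serverid is an input to both hashes, changing a server's id
would move it in every file's permuted list and invalidate every secret
previously derived for it.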

In the future, we would like to also use serverids to:

 * enable explicit server selection: tahoe.cfg could list the serverids
   that uploads will use, ignoring all others, to protect the user's
   upstream bandwidth and reliability choices. This selection could be
   delegated to a central party, allowing the server list to change over
   time without constant user involvement.
 * correlate reciprocal Accounting relationships

I *think* that a fully distributed introducer requires announcement
signatures. With a single central Introducer, we could achieve some
measure of control over the grid by restricting publishing access to
certain servers. But with a highly distributed log-flood-based
introduction system, we'd give up central control, and I think
individually-traceable announcements would make up for that loss.

Finally, a long-term goal is to move away from Foolscap to an HTTP-based
protocol that is easier to implement in non-Python languages, to
facilitate multiple implementations of the Tahoe protocol. Without
Foolscap, we'll need a different mechanism to securely identify a
server, for which the announcement signing keypair is appropriate.

== patch 1: python-ecdsa ==

https://github.com/warner/python-ecdsa/ is where I maintain a
pure-python ECDSA library, with a not-too-bad API. I think I may want to
make some changes to the API still. It's fast enough for use by signed
announcements, since sign/verify operations occur only once per server.
Once pycryptopp acquires ECDSA support, we should move to that, for the
30x speedup.

The library is embedded into {{{allmydata/src/util/ecdsa/*.py}}}, rather
than being added as a dependency, for two reasons. First, extra
dependencies are really making packagers' lives difficult, slowing
packaging efforts and thus hurting adoption. Second, changes to the
upstream API will be easier to accommodate by using a fixed version of
python-ecdsa, so I think it makes sense to copy it wholesale into the
tahoe tree until the API stabilizes (which needs to be driven by using
it and learning what works and what doesn't).

== patch 2: keypair generation ==

This adds {{{tahoe admin generate-keypair}}} and {{{derive-pubkey}}},
basic userspace tools to work with keys. The basic idea is that some day
you might use them to create a value that you then paste into a
tahoe.cfg file. They are currently unused.

== patch 3: everything else ==

I'll split this into:

 * terminology
 * V2 introducer protocol
 * announcement signatures
 * backwards compatibility with V1 protocol
 * serverid computation

=== Terminology ===

The {{{IntroducerServer}}} is an object that lives in the introducer
process, the one (just one, so far, but #68 will change that) identified
by the {{{introducer.furl}}}. Most of this note calls this the "server".
The {{{IntroducerClient}}} is an object that lives in each
non-introducer process, both tahoe storage servers and tahoe clients,
which manages the connection to the {{{IntroducerServer}}}.

The "publisher" is a Tahoe node (usually a storage server) which wants
to broadcast information about itself to the whole grid. Each time it
does this, it is said to "announce" its information. The bundle
of information is called an "announcement".

The "subscriber" is a Tahoe node (usually a client/gateway) that wants
to receive announcements.

Subscribers call their local {{{IntroducerClient.subscribe_to}}} to sign
up to hear announcements, and provide a callback that will be invoked
multiple times as they arrive. This provokes the {{{IntroducerClient}}}
to send a "subscribe" message to the server. Later, the server will send
an "announce" message to the client, and the {{{IntroducerClient}}} will
fire the callback.

Publishers call {{{IntroducerClient.publish}}} to deliver announcements,
which provokes the {{{IntroducerClient}}} to send a "publish" message to
the server, which provokes the server to send "announce" messages to all
interested clients.
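
The message flow above can be sketched in-process, with plain method
calls standing in for the Foolscap messages. The class names, the
{{{service_name}}} argument, and the example announcement are
assumptions for illustration, not the patch's actual signatures:

```python
class SketchIntroducerServer:
    """In-process stand-in for the IntroducerServer (no Foolscap, no network)."""
    def __init__(self):
        self._subscribers = {}   # service_name -> list of client objects

    def subscribe(self, client, service_name):
        # the "subscribe" message
        self._subscribers.setdefault(service_name, []).append(client)

    def publish(self, service_name, ann):
        # the "publish" message: fan out "announce" to interested clients
        for client in self._subscribers.get(service_name, []):
            client.announce(service_name, ann)


class SketchIntroducerClient:
    """Stand-in for the IntroducerClient living in each non-introducer node."""
    def __init__(self, server):
        self._server = server
        self._callbacks = {}     # service_name -> list of local callbacks

    def subscribe_to(self, service_name, callback):
        first = service_name not in self._callbacks
        self._callbacks.setdefault(service_name, []).append(callback)
        if first:
            # only one "subscribe" message per service is sent to the server
            self._server.subscribe(self, service_name)

    def publish(self, service_name, ann):
        self._server.publish(service_name, ann)

    def announce(self, service_name, ann):
        # the "announce" message from the server: fire the local callbacks
        for callback in self._callbacks.get(service_name, []):
            callback(ann)


server = SketchIntroducerServer()
storage_node = SketchIntroducerClient(server)   # acts as publisher
gateway = SketchIntroducerClient(server)        # acts as subscriber

received = []
gateway.subscribe_to("storage", received.append)
storage_node.publish("storage", {"nickname": "alice"})
```

This sketch does not replay earlier announcements to late subscribers;
it only shows the publish -> announce -> callback path.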

=== V2 Introducer Protocol ===

The old V1 protocol used 6-tuples as announcements: (FURL, service_name,
remoteinterface_name, nickname, my_version, oldest_supported). The new
V2 protocol uses an open-ended JSON-serializable dictionary, with a
number of top-level keys that are expected to be present.

The V2 client->server "publish" message adds a "canary" argument, which
allows the server to detect when the publisher has disconnected. This is
unused so far, but the intent is to display liveness status on the
server's "introweb" page, and to let it stop publishing data for
servers which are offline (or perhaps have remained offline for several
days).

The V2 client->server "subscribe" message adds a "subscriber_info"
argument, which lets the server's introweb page show information about
each subscriber (mostly nickname and version). In the V1 protocol, this
was accomplished by having each subscriber also "announce" a special
"stub_client" service, which didn't correspond to a real service, but
included enough information to build the status display. The V2
subscriber_info field is defined (by foolscap schema) to be a dictionary
with string keys, with at least "nickname", "my-version",
"app-versions", and "oldest-supported".

=== Announcement Signatures ===

V2 announcements on the wire are a 3-tuple (ann_d_json, sig_hex,
pubkey_hex), in which the announcement is serialized with JSON. Unsigned
announcements have {{{sig_hex == pubkey_hex == None}}}.

We use 256-bit ECDSA keys (from the NIST256p curve, since that seemed to
be the most widely implemented, in openssl/nss). {{{pubkey_hex}}} is a
base16-encoded uncompressed raw binary key (TODO: versioning), the
output of {{{VerifyingKey.to_string().encode("hex")}}}. This is 512 bits
long, and neither includes an OID nor a 0x04 "uncompressed" flag byte.
Signatures are computed with SHA1 (TODO: given NIST256p, let's use
SHA256), with an algorithm that is compatible with openssl (verified by
the python-ecdsa test suite). {{{sig_hex}}} is the output of
{{{SigningKey.sign(ann_d_json.encode("utf-8")).encode("hex")}}} which
uses python-ecdsa's minimal binary string encoding (no versioning
information).
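
A minimal sketch of building the wire 3-tuple, written for Python 3
(so {{{binascii.hexlify}}} replaces {{{.encode("hex")}}}). The helper
name {{{pack_announcement}}} is hypothetical. The signing key is passed
in and only needs a {{{sign(data)}}} method, which is what python-ecdsa's
{{{SigningKey}}} provides; the raw public key bytes would come from
{{{SigningKey.get_verifying_key().to_string()}}}:

```python
import json
from binascii import hexlify

def pack_announcement(ann_d, signing_key=None, pubkey_bytes=None):
    """Serialize ann_d and return (ann_d_json, sig_hex, pubkey_hex)."""
    ann_d_json = json.dumps(ann_d)
    if signing_key is None:
        # unsigned announcement: sig_hex == pubkey_hex == None
        return (ann_d_json, None, None)
    # sign the exact bytes of the string that goes on the wire
    sig = signing_key.sign(ann_d_json.encode("utf-8"))
    return (ann_d_json,
            hexlify(sig).decode("ascii"),
            hexlify(pubkey_bytes).decode("ascii"))

# hypothetical announcement contents
unsigned = pack_announcement({"nickname": "alice"})
```

The signature covers the serialized string rather than the dictionary,
so a verifier has to check it against the received {{{ann_d_json}}}
as-is, without re-serializing.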

Publishers give their {{{IntroducerClient.publish}}} both an
announcement dictionary and a {{{SigningKey}}} instance (or None).
Subscribers receive an announcement dictionary and a {{{VerifyingKey}}}
instance (which will be None if the announcement did not have a matching
valid signature; this covers unsigned announcements, forged/invalid
signatures, and valid signatures from some different pubkey).
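
The subscriber-side rule can be sketched as follows (Python 3). The
helper name and the {{{load_verifying_key}}} parameter are hypothetical;
with python-ecdsa the loader could be something like
{{{lambda raw: VerifyingKey.from_string(raw, curve=NIST256p)}}}, whose
{{{verify(signature, data)}}} raises on a bad signature:

```python
import json
from binascii import unhexlify

def unpack_announcement(ann_t, load_verifying_key):
    """Return (ann_d, key); key is None unless the signature checks out."""
    ann_d_json, sig_hex, pubkey_hex = ann_t
    ann_d = json.loads(ann_d_json)
    if sig_hex is None or pubkey_hex is None:
        return ann_d, None                      # unsigned
    try:
        key = load_verifying_key(unhexlify(pubkey_hex))
        # verify against the received string, not a re-serialized dict
        if not key.verify(unhexlify(sig_hex), ann_d_json.encode("utf-8")):
            return ann_d, None
    except Exception:
        # malformed hex, malformed key, or a signature that does not
        # verify under the claimed pubkey: treat all as "no valid key"
        return ann_d, None
    return ann_d, key

ann, key = unpack_announcement(('{"nickname": "alice"}', None, None), None)
```

A signature made by a different key fails verification under the
claimed pubkey, so that case also ends up with a key of None.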

=== Backwards Compatibility with V1 ===

There are four interesting V1+V2 compatibility cases, two on the
publishing half, and two on the subscribing/announcement half.

The V2 IntroducerServer provides all the same method names as the V1
server, plus additional publish_v2/subscribe_v2 methods that are only
used by V2 clients. On the V1 methods, the V2 server accepts the same
message format as the V1 server. The server seeks to hide the client
versions from each other: a V1 client receives only V1-format
announcements, and a V2 client receives only V2-format announcements,
regardless of what client version generated those announcements.

When a V1 client publishes to a V2 server, it uses the old "publish"
method name, allowing the server to detect the client's old version. The
server upconverts the V1-format announcement tuple into an unsigned
V2-format dictionary, leaving some fields empty (like
ann_d["app_versions"]={}) when necessary. It then dispatches this
V2-format announcement internally as if it was received from a real V2
client. When a V1 client subscribes to a V2 server, the old server-side
"subscribe" method wraps the remote reference in a SubscriberAdapter_v1,
which behaves just like a remote reference to a modern V2 subscriber,
but downconverts the messages to old-style V1 tuples before sending them
over the wire.
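
A rough sketch of the two conversions. The dictionary key names simply
reuse the V1 tuple field names and are an assumption; the real V2 keys
may differ. Only the empty {{{app_versions}}} default comes from the
description above:

```python
V1_FIELDS = ("furl", "service_name", "remoteinterface_name",
             "nickname", "my_version", "oldest_supported")

def upconvert_v1(ann_v1):
    """Turn a V1 6-tuple into an unsigned V2-style dictionary."""
    if len(ann_v1) != len(V1_FIELDS):
        raise ValueError("V1 announcements are 6-tuples")
    ann_d = dict(zip(V1_FIELDS, ann_v1))
    ann_d["app_versions"] = {}   # V1 carries no such field, so leave it empty
    return ann_d

def downconvert_v2(ann_d):
    """Turn a V2-style dictionary back into a V1 6-tuple, dropping extras."""
    return tuple(ann_d.get(name, "") for name in V1_FIELDS)

# hypothetical V1 announcement
v1 = ("pb://tubid@example.invalid:1234/swissnum", "storage",
      "RIStorageServer", "alice", "1.8.2", "1.5.0")
v2 = upconvert_v1(v1)
```

Downconversion is lossy: any V2-only keys (and the signature) have no
place in the V1 tuple.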

When a V2 client tries to publish an announcement, it first tries to
invoke the new "publish_v2" method, with a V2-style announcement
dictionary (maybe signed). If this callRemote fails in a way that looks
like the server does not implement publish_v2, the client concludes that
it is dealing with a V1 server. It then downconverts its announcement to
V1-style and sends it to the old V1 "publish" method.

When a V2 client wants to subscribe, it first tries the new
"subscribe_v2" method, passing itself as the desired recipient of
"announce_v2" messages (the remote callback, in a sense). If the
"subscribe_v2" method fails, the client concludes that it's dealing with
a V1 server, and calls the old V1 "subscribe" method instead, passing a
different object that can accept V1-style announcements. This
client-side object upconverts each V1 announcement into a V2-format
dictionary before internal delivery. The client also publishes a
"stub_client" announcement, so that the V1 server can display nicknames
and version numbers of all subscribed clients.
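
The try-V2-then-fall-back pattern for publishing can be sketched like
this. It is synchronous and in-process: {{{call_remote}}} and
{{{NoSuchMethod}}} are hypothetical stand-ins for Foolscap's
{{{callRemote}}} and for whatever failure indicates a missing remote
method, and the server classes are fakes:

```python
class NoSuchMethod(Exception):
    """Stand-in for the failure seen when the remote method does not exist."""

def call_remote(target, method_name, *args):
    method = getattr(target, method_name, None)
    if method is None:
        raise NoSuchMethod(method_name)
    return method(*args)

def publish_with_fallback(server, ann_v2, ann_v1):
    """Try publish_v2 first; fall back to the V1 publish method."""
    try:
        call_remote(server, "publish_v2", ann_v2)
        return "v2"
    except NoSuchMethod:
        # looks like a V1 server: send the downconverted tuple instead
        call_remote(server, "publish", ann_v1)
        return "v1"

class FakeV1Server:
    def __init__(self):
        self.got = []
    def publish(self, ann_v1):
        self.got.append(("v1", ann_v1))

class FakeV2Server(FakeV1Server):
    def publish_v2(self, ann_v2):
        self.got.append(("v2", ann_v2))

old, new = FakeV1Server(), FakeV2Server()
ann_v2 = ('{"nickname": "alice"}', None, None)
ann_v1 = ("furl", "storage", "RIStorageServer", "alice", "1.8.2", "1.5.0")
used_old = publish_with_fallback(old, ann_v2, ann_v1)
used_new = publish_with_fallback(new, ann_v2, ann_v1)
```

The subscribe path would follow the same shape, with the extra step of
handing a V1 server a different recipient object that upconverts what
it receives.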

=== Server ID Computation ===

This is the trickiest part, and ties into our eventual goals for having
signed announcements.

In the ideal future world, as we envision it today, we no longer use
Foolscap, have no Tub IDs, and use HTTP to send signed unencrypted
storage-protocol messages from client to server. In that world, the
ECDSA pubkey is the serverid, and clients use signed announcements to
learn genuine information about servers.

But since we've been using foolscap tubids as serverids, and since we've
been using serverids to compute both serverlist-permutation and
shared-secrets, there is a compatibility concern. Any shares we've
uploaded to tubid-based servers must continue to be accessible with the
old serverid. I think this means that, even when we add an ECDSA pubkey
to those servers, they need to continue to use their tubid as serverid,
and we need to allow an announcement signed with pubkey-A to correctly
claim to have a serverid tubid-B. This will require a verification step,
in which the client connects to the tubid-B-bearing FURL, and the server
is expected to announce (over that channel) that it uses pubkey-A. Once
that is accomplished, the client can believe other metadata included in
the signed announcement.

This feels complex, so I'm uneasy about it, but here's the protocol I
have in mind:

* new-style announcements include two keys "serverid" and "serverid-type"
 - for brand-new servers on pure-V2 grids, serverid-type is "pubkey_v1",
   and serverid is the ASCII pubkey
 - for all other servers, serverid-type is "furl_v1", and serverid is
   base32-encoded tubid
* when client receives announcement from Introducer:
 - serverid-type="pubkey_v1": assert ann[serverid] matches signature
 - serverid-type="furl_v1": connect to ann[furl], send a
   "getAnnouncement" message, response is a new announcement (hopefully
   identical to the one from the Introducer), assert that serverid in
   new ann matches furl, then accept new announcement (ignoring the
   original).
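
The client-side check above can be sketched as a single function. The
function names are hypothetical, {{{fetch_announcement}}} stands in for
connecting to the FURL and sending "getAnnouncement", and the sketch
assumes the serverid in a "furl_v1" announcement is the tubid exactly as
it appears in the FURL ({{{pb://TUBID@hints/name}}}):

```python
def tubid_of_furl(furl):
    """Extract the tubid portion of a pb://TUBID@hints/name FURL."""
    if not furl.startswith("pb://") or "@" not in furl:
        raise ValueError("not an authenticated FURL: %r" % (furl,))
    return furl[len("pb://"):].split("@", 1)[0]

def accept_announcement(ann, verified_pubkey, fetch_announcement):
    """Return the announcement to trust, or None to reject it.

    ann                -- dictionary received from the Introducer
    verified_pubkey    -- ASCII pubkey whose signature verified, or None
    fetch_announcement -- callable(furl) returning the server's own
                          announcement dictionary
    """
    kind = ann.get("serverid-type")
    if kind == "pubkey_v1":
        # the claimed serverid must be the key that actually signed it
        if verified_pubkey is not None and ann.get("serverid") == verified_pubkey:
            return ann
        return None
    if kind == "furl_v1":
        furl = ann["furl"]
        fresh = fetch_announcement(furl)
        # trust only what the tubid-holder itself says, and only if its
        # claimed serverid matches the FURL we connected to
        if fresh.get("serverid") == tubid_of_furl(furl):
            return fresh
        return None
    return None   # unknown serverid-type

furl = "pb://tubidb@example.invalid:1234/swissnum"   # made-up FURL
from_introducer = {"serverid-type": "furl_v1", "serverid": "tubidb",
                   "furl": furl, "nickname": "maybe-forged"}
from_server = {"serverid-type": "furl_v1", "serverid": "tubidb",
               "furl": furl, "nickname": "alice"}
trusted = accept_announcement(from_introducer, None, lambda f: from_server)
```

Note that in the "furl_v1" branch the Introducer's copy is discarded
even when it is identical, which matches the "ignoring the original"
step.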

== remaining work ==

* do we need version numbers in serverids? in signatures?
* serverid format: they are used in human-readable config files,
  permutation hash. Current serverids are binary. Make new ones ASCII?
* switch ECDSA hash step from SHA1 to SHA256
* clean up ascii/unicode transitions
  - plan for HTTP protocol
  - allow unicode in announcement dictionary (especially in nickname)
  - sign only well-defined binary data
  - manage signature handling well enough to allow variety of transport
    protocols (i.e. stick to ASCII)
* should we replace all data arguments with a single JSON string per
  message, in anticipation of switching the Introducer protocol to HTTP?
  (i.e. design the HTTP protocol first, then map it to Foolscap)
* turning it all on