[tahoe-lafs-trac-stream] [tahoe-lafs] #466: extendable Introducer protocol: dictionary-based, signed announcements

Sat Feb 12 13:07:36 PST 2011

#466: extendable Introducer protocol: dictionary-based, signed announcements
------------------------------+---------------------------------------------
     Reporter:  warner        |       Owner:  nejucomo
         Type:  enhancement   |      Status:  new
     Priority:  major         |   Milestone:  1.9.0
    Component:  code-network  |     Version:  1.1.0
   Resolution:                |    Keywords:  introduction forward-compatibility performance accounting ecdsa pycryptopp review-needed
Launchpad Bug:                |
------------------------------+---------------------------------------------

Comment (by warner):

 '''State of the patch'''

 I'd say this project is about 80% complete. The patch currently attached
 for review is split up into three pieces. This note is to explain what
 those pieces do, what the overall design is like, and what's left to
 design/build.

 == Why Do We Want This? ==

 The goal of this ticket is to add signatures to our introducer
 announcements, and to make them extensible. Current announcements are a
 fixed 6-tuple: (FURL, service_name, remote_interface_name, nickname,
 version, oldest_supported). The new announcements will be an arbitrary
 JSON-serializable dictionary, embedded in a 3-tuple of (ann_json,
 signature, pubkey), where the sig/pubkey are optional.

 The utility of extensibility is pretty obvious: there are additional
 services and features we could enable (or make more efficient) if we
 could safely advertise them ahead of time through the introducer. In
 general, extensible protocol formats (dicts, not tuples) improve
 flexibility and enables change, since it's awfully difficult to make
 changes to all nodes simultaneously. Flexible announcements aren't
 strictly necessary: we could instead e.g. add new methods to the
 RIStorageServer object, and have clients speculatively attempt to invoke
 them, and gently tolerate !NameErrors, but that seems inelegant, and
 requires a roundtrip: by putting slowly-changing things in the
 announcements, clients learn about them earlier.

 The value of adding signatures is great, but not immediate. Signatures
 would bind the contents of an announcement to some "serverid",
 preventing other parties (other grid members, or the Introducer itself)
 from forging those contents. This would turn the introduction system
 into a secure channel from publisher to subscriber, indexed by serverid.

 Without signatures, anybody in the grid can publish anything they like,
 such as a record with Alice's real storage-server FURL but with a
 nickname of "Bob", which would currently replace Alice's real
 announcement.

 Currently we use Foolscap Tub IDs (i.e. hash of the tub certificate,
 which appears in each FURL) as server IDs. The only way to verify
 possession of the corresponding secret is to connect to a FURL that uses
 this tubid: Foolscap ensures that the object you connect to (and
 subsequently send callRemote messages to) is selected by the
 secret-holder. This requires an online check, whereas a signed message
 could be verified offline.

 We currently use serverids for three things:

  * to distinguish between storage servers who wish to be treated
    separately, sending separate shares to each
  * as a stable long-term seed for the permuted peerlist, used to decide
    how to distribute shares of each file
  * to calculate several shared secrets: the mutable-file
    "write-enabler", and the renew/expire lease tokens. These need to be
    different for each server, so that when a client exercises their
    authority on server A, that doesn't enable A to exercise the client's
    corresponding authority on server B
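
 As a rough illustration of the last two uses, the sketch below permutes
 a server list per file and derives a per-server secret. The hash
 construction, the function names, and the sample values are assumptions
 made for illustration; they are not Tahoe's actual tagged-hash
 definitions.

```python
import hashlib

def permuted_servers(serverids, storage_index):
    """Order serverids by hash(storage_index + serverid) (illustrative only)."""
    return sorted(serverids,
                  key=lambda sid: hashlib.sha256(storage_index + sid).digest())

def per_server_secret(client_secret, serverid):
    """Derive a secret that is different for every serverid (illustrative only)."""
    return hashlib.sha256(client_secret + serverid).digest()

serverids = [b"server-A", b"server-B", b"server-C"]

# The same storage index always yields the same ordering of servers.
order = permuted_servers(serverids, b"storage-index-1")

# Revealing secret_a to server A tells A nothing useful about secret_b.
secret_a = per_server_secret(b"master-lease-secret", b"server-A")
secret_b = per_server_secret(b"master-lease-secret", b"server-B")
```

 Both functions take the serverid as an input, which is why changing a
 server's serverid would move its position in the permuted list and
 invalidate the secrets already stored on it.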

 In the future, we would like to also use serverids to:

  * enable explicit server selection: tahoe.cfg could list the serverids
    that uploads will use, ignoring all others, to protect the user's
    upstream bandwidth and reliability choices. This selection could be
    delegated to a central party, allowing the server list to change over
    time without constant user involvement.
  * correlate reciprocal Accounting relationships

 I *think* that a fully distributed introducer requires announcement
 signatures. With a single central Introducer, we could achieve some
 measure of control over the grid by restricting publishing access to
 certain servers. But with a highly distributed log-flood-based
 introduction system, we'd give up central control, and I think
 individually-traceable announcements would make up for that loss.

 Finally, a long-term goal is to move away from Foolscap to an HTTP-based
 protocol that is easier to implement in non-Python languages, to
 facilitate multiple implementations of the Tahoe protocol. Without
 Foolscap, we'll need a different mechanism to securely identify a
 server, for which the announcement signing keypair is appropriate.

 == patch 1: python-ecdsa ==

 https://github.com/warner/python-ecdsa/ is where I maintain a
 pure-python ECDSA library, with a not-too-bad API. I think I may want to
 make some changes to the API still. It's fast enough for use by signed
 announcements, since sign/verify operations occur only once per server.
 Once pycryptopp acquires ECDSA support, we should move to that, for the
 30x speedup.

 The library is embedded into {{{allmydata/src/util/ecdsa/*.py}}}, rather
 than being added as a dependency, for two reasons. First, extra
 dependencies are really making packagers' lives difficult, slowing
 packaging efforts and thus hurting adoption. Second, changes to the
 upstream API will be easier to accommodate by using a fixed version of
 python-ecdsa, so I think it makes sense to copy it wholesale into the
 tahoe tree until the API stabilizes (which needs to be driven by using
 it and learning what works and what doesn't).

 == patch 2: keypair generation ==

 This adds {{{tahoe admin generate-keypair}}} and {{{derive-pubkey}}},
 basic userspace tools to work with keys. The basic idea is that some day
 you might use them to create a value that you then paste into a
 tahoe.cfg file. They are currently unused.

 == patch 3: everything else ==

 I'll split this into:

  * terminology
  * V2 introducer protocol
  * announcement signatures
  * backwards compatibility with V1 protocol
  * serverid computation

 === Terminology ===

 The {{{IntroducerServer}}} is an object that lives in the introducer
 process, the one (just one, so far, but #68 will change that) identified
 by the {{{introducer.furl}}}. Most of this note calls this the "server".
 The {{{IntroducerClient}}} is an object that lives in each
 non-introducer process, both tahoe storage servers and tahoe clients,
 which manages the connection to the {{{IntroducerServer}}}.

 The "publisher" is a Tahoe node (usually a storage server) which wants
 to broadcast information about itself to the whole grid. Each time
 it does this, it is said to "announce" its information. The bundle
 of information is called an "announcement".

 The "subscriber" is a Tahoe node (usually a client/gateway) that wants
 to receive announcements.

 Subscribers call their local {{{IntroducerClient.subscribe_to}}} to sign
 up to hear announcements, and provide a callback that will be invoked
 multiple times as they arrive. This provokes the {{{IntroducerClient}}}
 to send a "subscribe" message to the server. Later, the server will send
 an "announce" message to the client, and the {{{IntroducerClient}}} will
 fire the callback.

 Publishers call {{{IntroducerClient.publish}}} to deliver announcements,
 which provokes the {{{IntroducerClient}}} to send a "publish" message to
 the server, which provokes the server to send "announce" messages to all
 interested clients.
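
 The message flow described above can be modelled with a toy, in-process
 sketch. Only the names {{{IntroducerServer}}}, {{{IntroducerClient}}},
 {{{subscribe_to}}}, and {{{publish}}} come from this note; the argument
 lists, the per-service bookkeeping, and the absence of Foolscap and of
 any network are simplifying assumptions.

```python
class IntroducerServer:
    """Toy stand-in for the introducer: routes announcements in-process."""
    def __init__(self):
        self.subscribers = {}  # service_name -> list of IntroducerClient

    def subscribe(self, client, service_name):
        # handles the client's "subscribe" message
        self.subscribers.setdefault(service_name, []).append(client)

    def publish(self, service_name, ann):
        # handles a "publish" message by sending "announce" to subscribers
        for client in self.subscribers.get(service_name, []):
            client.announce(service_name, ann)


class IntroducerClient:
    """Lives in every non-introducer node and talks to the server."""
    def __init__(self, server):
        self.server = server
        self.callbacks = {}  # service_name -> list of callbacks

    def subscribe_to(self, service_name, callback):
        first = service_name not in self.callbacks
        self.callbacks.setdefault(service_name, []).append(callback)
        if first:
            # only one "subscribe" message per service reaches the server
            self.server.subscribe(self, service_name)

    def publish(self, service_name, ann):
        self.server.publish(service_name, ann)

    def announce(self, service_name, ann):
        # invoked by the server; fires every local callback
        for callback in self.callbacks.get(service_name, []):
            callback(ann)


server = IntroducerServer()
storage_node = IntroducerClient(server)   # the publisher
gateway = IntroducerClient(server)        # the subscriber

received = []
gateway.subscribe_to("storage", received.append)
storage_node.publish("storage", {"nickname": "alice"})
```

 After the last line, {{{received}}} holds the one announcement that the
 storage node published.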

 === V2 Introducer Protocol ===

 The old V1 protocol used 6-tuples as announcements: (FURL, service_name,
 remote_interface_name, nickname, my_version, oldest_supported). The new
 V2 protocol uses an open-ended JSON-serializable dictionary, with a
 number of top-level keys that are expected to be present.

 The V2 client->server "publish" message adds a "canary" argument, which
 allows the server to detect when the publisher has disconnected. This is
 unused so far, but the intent is to display liveness status on the
 server's "introweb" page, and to let it stop publishing data for
 servers which are offline (or perhaps have remained offline for several
 days).

 The V2 client->server "subscribe" message adds a "subscriber_info"
 argument, which lets the server's introweb page show information about
 each subscriber (mostly nickname and version). In the V1 protocol, this
 was accomplished by having each subscriber also "announce" a special
 "stub_client" service, which didn't correspond to a real service, but
 included enough information to build the status display. The V2
 subscriber_info field is defined (by foolscap schema) to be a dictionary
 with string keys, with at least "nickname", "my-version",
 "app-versions", and "oldest-supported".

 === Announcement Signatures ===

 V2 announcements on the wire are a 3-tuple (ann_d_json, sig_hex,
 pubkey_hex), in which the announcement is serialized with JSON. Unsigned
 announcements have {{{sig_hex == pubkey_hex == None}}}.

 We use 256-bit ECDSA keys (from the NIST256p curve, since that seemed to
 be the most widely implemented, in openssl/nss). {{{pubkey_hex}}} is a
 base16-encoded uncompressed raw binary key (TODO: versioning), the
 output of {{{VerifyingKey.to_string().encode("hex")}}}. This is 512 bits
 long, and includes neither an OID nor a 0x04 "uncompressed" flag byte.
 Signatures are computed with SHA1 (TODO: given NIST256p, let's use
 SHA256), with an algorithm that is compatible with openssl (verified by
 the python-ecdsa test suite). {{{sig_hex}}} is the output of
 {{{SigningKey.sign(ann_d_json.encode("utf-8")).encode("hex")}}} which
 uses python-ecdsa's minimal binary string encoding (no versioning
 information).

 Publishers give their {{{IntroducerClient.publish}}} both an
 announcement dictionary and a {{{SigningKey}}} instance (or None).
 Subscribers receive an announcement dictionary and a {{{VerifyingKey}}}
 instance (which will be None if the announcement did not have a matching
 valid signature, which covers unsigned announcements, forged/invalid
 signatures, and valid signatures from some different pubkey).
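
 A minimal sketch of packing and unpacking the wire tuple follows. It is
 duck-typed against the python-ecdsa method names ({{{sign}}},
 {{{get_verifying_key}}}, {{{to_string}}}, {{{verify}}}) rather than
 importing the library, and it uses {{{binascii}}} because
 {{{.encode("hex")}}} only exists on Python 2. The function names, the
 {{{load_verifying_key}}} factory, and the use of {{{sort_keys}}} are
 assumptions of this sketch, not part of the patch.

```python
import json
from binascii import hexlify, unhexlify

def pack_announcement(ann_d, signing_key=None):
    """Build the (ann_d_json, sig_hex, pubkey_hex) wire tuple.

    signing_key is None for an unsigned announcement; otherwise it must
    offer sign(data) and get_verifying_key().to_string().
    """
    ann_d_json = json.dumps(ann_d, sort_keys=True)
    if signing_key is None:
        return (ann_d_json, None, None)
    sig = signing_key.sign(ann_d_json.encode("utf-8"))
    pubkey = signing_key.get_verifying_key().to_string()
    return (ann_d_json,
            hexlify(sig).decode("ascii"),
            hexlify(pubkey).decode("ascii"))

def unpack_announcement(wire, load_verifying_key):
    """Return (ann_d, verifying_key); the key is None unless the sig checks out.

    load_verifying_key(pubkey_bytes) is a caller-supplied factory
    (hypothetical name) whose result offers verify(sig, data), returning
    True or raising on a bad signature.
    """
    ann_d_json, sig_hex, pubkey_hex = wire
    ann_d = json.loads(ann_d_json)
    if sig_hex is None or pubkey_hex is None:
        return (ann_d, None)
    try:
        key = load_verifying_key(unhexlify(pubkey_hex))
        if key.verify(unhexlify(sig_hex), ann_d_json.encode("utf-8")):
            return (ann_d, key)
    except Exception:
        pass  # malformed hex, unusable key, or bad signature
    return (ann_d, None)

# Unsigned announcements carry None for both sig_hex and pubkey_hex.
wire = pack_announcement({"nickname": "alice"})
```

 The signature covers the exact JSON string that travels on the wire, so
 the receiver verifies the received string as-is and never needs to
 re-serialize the dictionary.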

 === Backwards Compatibility with V1 ===

 There are four interesting V1+V2 compatibility cases, two on the
 publishing half, and two on the subscribing/announcement half.

 The V2 !IntroducerServer provides all the same method names as the V1
 server, plus additional publish_v2/subscribe_v2 methods that are only
 used by V2 clients. On the V1 methods, the V2 server accepts the same
 message format as the V1 server. The server seeks to hide the client
 versions from each other: a V1 client receives only V1-format
 announcements, and a V2 client receives only V2-format announcements,
 regardless of what client version generated those announcements.

 When a V1 client publishes to a V2 server, it uses the old "publish"
 method name, allowing the server to detect the client's old version. The
 server upconverts the V1-format announcement tuple into an unsigned
 V2-format dictionary, leaving some fields empty (like
 {{{ann_d["app_versions"]={}}}}) when necessary. It then dispatches this
 V2-format announcement internally as if it were received from a real V2
 client. When a V1 client subscribes to a V2 server, the old server-side
 "subscribe" method wraps the remote reference in a !SubscriberAdapter_v1,
 which behaves just like a remote reference to a modern V2 subscriber,
 but downconverts the messages to old-style V1 tuples before sending them
 over the wire.
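
 The two server-side conversions might look roughly like this. The
 dictionary key names are assumptions that simply reuse the V1 tuple's
 field names (only {{{app_versions}}} is named in this note), and the
 sample FURL is a placeholder.

```python
V1_FIELDS = ("FURL", "service_name", "remote_interface_name",
             "nickname", "my_version", "oldest_supported")

def upconvert_v1(ann_t):
    """Turn a V1 6-tuple into an unsigned V2-style dictionary."""
    if len(ann_t) != len(V1_FIELDS):
        raise ValueError("V1 announcements are 6-tuples")
    ann_d = dict(zip(V1_FIELDS, ann_t))
    ann_d["app_versions"] = {}  # V1 has no such field, so it stays empty
    return ann_d

def downconvert_v2(ann_d):
    """Turn a V2-style dictionary back into a V1 6-tuple, dropping extras."""
    return tuple(ann_d[name] for name in V1_FIELDS)

v1_ann = ("pb://tubid@example.invalid:1234/swissnum", "storage",
          "RIStorageServer", "alice", "1.8.2", "1.0.0")
v2_ann = upconvert_v1(v1_ann)
```

 Downconversion is lossy by design: any key that has no slot in the V1
 tuple is dropped before the message goes to an old subscriber.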

 When a V2 client tries to publish an announcement, it first tries to
 invoke the new "publish_v2" method, with a V2-style announcement
 dictionary (maybe signed). If this callRemote fails in a way that looks
 like the server does not implement publish_v2, the client concludes that
 it is dealing with a V1 server. It then downconverts its announcement to
 V1-style and sends it to the old V1 "publish" method.
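
 A synchronous sketch of this publish fallback is shown below. Real
 Foolscap {{{callRemote}}} calls are asynchronous and report failures
 differently; the {{{NoSuchMethod}}} exception, the fake server classes,
 and the {{{downconvert}}} argument are hypothetical stand-ins, and the
 "canary" argument is left out.

```python
class NoSuchMethod(Exception):
    """Hypothetical stand-in for 'the remote side lacks this method'."""

class FakeV1Server:
    """Understands only the old "publish" message."""
    def __init__(self):
        self.received = []

    def callRemote(self, name, *args):
        if name != "publish":
            raise NoSuchMethod(name)
        self.received.append(args[0])

class FakeV2Server(FakeV1Server):
    """Understands "publish_v2" as well as the old "publish"."""
    def callRemote(self, name, *args):
        if name == "publish_v2":
            self.received.append(args[0])
        else:
            FakeV1Server.callRemote(self, name, *args)

def publish_with_fallback(server, ann_v2, downconvert):
    """Try publish_v2 first; fall back to V1 publish if it is missing."""
    try:
        server.callRemote("publish_v2", ann_v2)
        return "v2"
    except NoSuchMethod:
        server.callRemote("publish", downconvert(ann_v2))
        return "v1"

old_server = FakeV1Server()
new_server = FakeV2Server()
to_v1 = lambda ann_d: (ann_d["nickname"],)  # placeholder downconversion

used_new = publish_with_fallback(new_server, {"nickname": "alice"}, to_v1)
used_old = publish_with_fallback(old_server, {"nickname": "alice"}, to_v1)
```

 Only the "method is missing" failure triggers the fallback here; any
 other error propagates, so a broken V2 server is not mistaken for a V1
 server.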

 When a V2 client wants to subscribe, it first tries the new
 "subscribe_v2" method, passing itself as the desired recipient of
 "announce_v2" messages (the remote callback, in a sense). If the
 "subscribe_v2" method fails, the client concludes that it's dealing with
 a V1 server, and calls the old V1 "subscribe" method instead, passing a
 different object that can accept V1-style announcements. This
 client-side object upconverts each V1 announcement into a V2-format
 dictionary before internal delivery. The client also publishes a
 "stub_client" announcement, so that the V1 server can display nicknames
 and version numbers of all subscribed clients.

 === Server ID Computation ===

 This is the trickiest part, and ties into our eventual goals for having
 signed announcements.

 In the ideal future world, as we envision it today, we no longer use
 Foolscap, have no Tub IDs, and use HTTP to send signed unencrypted
 storage-protocol messages from client to server. In that world, the
 ECDSA pubkey is the serverid, and clients use signed announcements to
 learn genuine information about servers.

 But since we've been using foolscap tubids as serverids, and since we've
 been using serverids to compute both serverlist-permutation and
 shared-secrets, there is a compatibility concern. Any shares we've
 uploaded to tubid-based servers must continue to be accessible with the
 old serverid. I think this means that, even when we add an ECDSA pubkey
 to those servers, they need to continue to use their tubid as serverid,
 and we need to allow an announcement signed with pubkey-A to correctly
 claim to have a serverid tubid-B. This will require a verification step,
 in which the client connects to the tubid-B-bearing FURL, and the server
 is expected to announce (over that channel) that it uses pubkey-A. Once
 that is accomplished, the client can believe other metadata included in
 the signed announcement.

 This feels complex, so I'm uneasy about it, but here's the protocol I
 have in mind:

 * new-style announcements include two keys "serverid" and "serverid-type"
  - for brand-new servers on pure-V2 grids, serverid-type is "pubkey_v1",
    and serverid is the ASCII pubkey
  - for all other servers, serverid-type is "furl_v1", and serverid is
    base32-encoded tubid
 * when client receives announcement from Introducer:
  - serverid-type="pubkey_v1": assert ann[serverid] matches signature
  - serverid-type="furl_v1": connect to ann[furl], send a
    "getAnnouncement" message, response is a new announcement (hopefully
    identical to the one from the Introducer), assert that serverid in
    new ann matches furl, then accept new announcement (ignoring the
    original).
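
 The client-side checks listed above can be sketched as follows. The
 function names, the {{{get_announcement}}} callable, and the assumption
 that a FURL looks like {{{pb://TUBID@hints/name}}} with the tubid
 already in the same encoding as the serverid are all illustrative.

```python
def tubid_from_furl(furl):
    """Extract the tubid from a FURL shaped like pb://TUBID@hints/name."""
    if not furl.startswith("pb://") or "@" not in furl:
        raise ValueError("not a FURL: %r" % (furl,))
    return furl[len("pb://"):].split("@", 1)[0]

def accept_announcement(ann, signer_pubkey, get_announcement):
    """Return the announcement to trust, or raise ValueError.

    signer_pubkey: ASCII pubkey that validly signed ann, or None.
    get_announcement(furl): hypothetical callable that connects to the
    FURL, sends "getAnnouncement", and returns the server's reply.
    """
    kind = ann["serverid-type"]
    if kind == "pubkey_v1":
        if signer_pubkey is None or ann["serverid"] != signer_pubkey:
            raise ValueError("serverid does not match the signing key")
        return ann
    if kind == "furl_v1":
        fresh = get_announcement(ann["furl"])
        if fresh["serverid"] != tubid_from_furl(ann["furl"]):
            raise ValueError("serverid does not match the FURL's tubid")
        return fresh  # the copy relayed by the Introducer is ignored
    raise ValueError("unknown serverid-type: %r" % (kind,))

furl = "pb://abcd2345@example.invalid:1234/storage"
direct = {"serverid-type": "furl_v1", "serverid": "abcd2345",
          "furl": furl, "nickname": "alice"}
relayed = dict(direct, nickname="bob")  # tampered with on the way

trusted = accept_announcement(relayed, None, lambda f: direct)
```

 In the furl_v1 case the tampered nickname never reaches the caller,
 because only the copy fetched over the tubid-authenticated connection
 is returned.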

 == remaining work ==

 * do we need version numbers in serverids? in signatures?
 * serverid format: they are used in human-readable config files,
   permutation hash. Current serverids are binary. Make new ones ASCII?
 * switch ECDSA hash step from SHA1 to SHA256
 * clean up ascii/unicode transitions
   - plan for HTTP protocol
   - allow unicode in announcement dictionary (especially in nickname)
   - sign only well-defined binary data
   - manage signature handling well enough to allow variety of transport
     protocols (i.e. stick to ASCII)
 * should we replace all data arguments with a single JSON string per
   message, in anticipation of switching the Introducer protocol to HTTP?
   (i.e. design the HTTP protocol first, then map it to Foolscap)
 * turning it all on

-- 
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/466#comment:17>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage