#466 closed enhancement (fixed)

extendable Introducer protocol: dictionary-based, signed announcements

Reported by: warner Owned by: warner
Priority: major Milestone: 1.10.0
Component: code-network Version: 1.1.0
Keywords: introduction forward-compatibility performance accounting ecdsa pycryptopp Cc: writefaruq@…
Launchpad Bug:

Description

Zooko and I have discussed a new API for the Introducer that would be more extensible for future uses. We need three features out of the new design:

  • announce dictionaries rather than tuples, so that we can add new keys to the announcements without affecting older subscribers who won't understand them
  • handle signed announcements: eventually we'll be using "blessed servers" to allow clients to restrict which storage servers they'll use (they can specify a public key, and only pay attention to announcements that were signed with the corresponding private key)
  • be compatible with v1.0/v1.1 clients, by using new remote_ methods and leaving the old methods in place.

This is one of the pre-requisites for implementing Accounting, since when we require clients to use rights-amplification to obtain a reference to an attenuated/labeled storage server object (instead of the current scheme in which everybody gets access to a full-powered server reference), we'll be announcing the "login" facet via a different interface name.

We're also planning on using the introducer to allow storage servers to update their announcement when they get full (or when they free enough space to resume accepting shares). This is a performance optimization: if a server is going to reject all shares, it is faster to remove it from the list of available writers than to ask and be rejected for each upload. The server should remain available for reading, though. We're not yet sure how we'll do this:

  • announce a "writeable" flag for each server
  • announce read-server and write-server separately

We should keep the idea of "updating your earlier announcement" in mind as we implement this improvement.

Attachments (3)

2011-02-p1.diff (98.6 KB) - added by warner at 2011-02-07T18:29:02Z.
copy python-ecdsa-0.7 into tree as allmydata.util.ecdsa
2011-02-p2.diff (9.3 KB) - added by warner at 2011-02-07T18:29:36Z.
add 'tahoe admin generate-keypair/derive-pubkey' commands, apply on top of p1
2011-02-p3.diff (109.2 KB) - added by warner at 2011-02-07T18:36:43Z.
new Introducer. see https://github.com/warner/tahoe-lafs/tree/466-introducer-take2 for updates


Change History (36)

comment:1 Changed at 2008-06-17T18:32:40Z by warner

More design notes:

  • each announcement is a certificate chain (to allow key-rotation, and probably some forms of extendability), so a variable-length list of certificates.
  • I'm thinking that we could use the same cert syntax for both introducer announcements and storage authority, so I'm kind of front-loading the design here.
  • Each certificate is a triple of (encoded-message, signature, pubkey-identifier).
    • The encoded-message is the actual content, probably a JSON-encoded dictionary (but of course then we must base64-encode any binary strings, since JSON can only carry unicode strings). Note that the encoded-message is a single string, and is generally left in encoded form until someone needs to know the contents. The only time encoding takes place is just before a signature is created. The encoded form *is* the canonical form.
    • The signature is an EC-DSA-192 signature of the encoded-message, using some public key. This is a single string. If the cert is unsigned, this will be a 0-length string.
    • The pubkey-identifier is optional: if omitted, it is a 0-length string. This helps a verifier figure out which pubkey was used to compute the signature. If omitted, the verifier must attempt to verify the signature against all root certs. If present, it is a strict prefix of the full serialized pubkey. Since we don't expect to have more than a handful of root certs, let's say this should be 4 bytes long:
         class InvalidCertificate(Exception):
             pass

         def find_parent_cert(cert, roots):
             encoded_message, signature, pubkey_id = cert
             for rootcert in roots:
                 pubkey = rootcert.pubkey
                 # an empty pubkey_id is a prefix of every key, so all roots get tried
                 if pubkey.serialize().startswith(pubkey_id):
                     if pubkey.verify(encoded_message, signature):
                         return (encoded_message, rootcert) # valid, this is the parent cert
             raise InvalidCertificate("no matching root certificate found")
      
  • The announcement protocol will use a signature like:
    EncodedMessage = StringConstraint(regexp=PRINTABLE)
    Certificate = TupleOf(EncodedMessage, str, str)
    CertificateChain = ListOf(Certificate)
    class RIIntroducerPublisher(RemoteInterface):
        def publish(announcement=CertificateChain):
            return None
    
  • We'll need to define more serialization details for storage authority, to get the whole certificate chain down to a single string. The goals for that use case are:
    • tight encoding: no double-base32 encoding of anything, short strings, give up generality in deference to brevity
    • the final string should be base32- or base62- safe (so it can fit comfortably in a URL's query argument)
  • we need the encoded message to be serialized in some well-defined form (so that we know how to create its signature), but we also want flexibility for what goes into the encoded message, hence the JSON. For storage authority cert chains, we might not actually want that flexibility: if the client doesn't recognize one of the attenuations, it should fail-safe by rejecting the cert. There are a couple of competing goals here, and I'm not sure what the best answer is.
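
The point above about the encoded form being the canonical form can be illustrated with a small sketch. It uses only the standard library; the dictionary keys and the helper name message_to_verify are made up for illustration, and the SHA-256 digest merely stands in for "the bytes a signature would cover":

```python
import hashlib
import json

# Hypothetical announcement content; the key names are illustrative only.
announcement = {"nickname": "alice", "service-name": "storage"}

# Two equally valid JSON encodings of the same dictionary.
encoded_a = json.dumps(announcement, sort_keys=True)
encoded_b = json.dumps(announcement, sort_keys=True, separators=(",", ":"))

# Both decode to the same content...
assert json.loads(encoded_a) == json.loads(encoded_b)

# ...but the bytes differ, so a signature over one would not cover the other.
digest_a = hashlib.sha256(encoded_a.encode("utf-8")).hexdigest()
digest_b = hashlib.sha256(encoded_b.encode("utf-8")).hexdigest()
same_bytes = (digest_a == digest_b)


def message_to_verify(cert):
    """Return the exact string a verifier must check: the first element of
    the (encoded-message, signature, pubkey-identifier) triple, untouched."""
    encoded_message, _signature, _pubkey_id = cert
    return encoded_message
```

This is why a verifier should check the signature against the string it received, and decode the JSON only afterwards, rather than decoding and re-encoding.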

comment:2 Changed at 2008-06-17T19:04:18Z by warner

Versioning: the announcer API (possibly called publish_v2) will implicitly define the version of the certificate chain: publish_v2 is always called with a chain in a specific format. If we switch to different signature schemes (perhaps a different key length) in the future, they can use a new API call (publish_v3) which will include additional arguments as necessary, or which will deliver certs with a different format (possibly including additional data, like which signature scheme is in use).

This technique can be generalized: clients can attempt to call methname_vNN for the highest NN they know. If they get a NameError, they can fall back to older interfaces (or fail), and they should remember which NN worked so they can avoid repeating the failure the next time.
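
A minimal sketch of that fallback, assuming a plain local object in place of a Foolscap remote reference. VersionedCaller and OldServer are hypothetical names, and a missing method is detected with getattr rather than by catching a remote NameError:

```python
class VersionedCaller:
    """Call the newest methname_vNN that the other side offers.

    `target` stands in for a remote reference. A missing method is
    detected locally with getattr; a real remote call would instead
    fail with an error that the caller has to interpret.
    """

    def __init__(self, target, known_versions):
        self.target = target
        self.known_versions = sorted(known_versions, reverse=True)
        self.working = {}  # method base name -> version that worked

    def call(self, basename, *args):
        remembered = self.working.get(basename)
        if remembered is not None:
            candidates = [remembered]  # skip versions that failed before
        else:
            candidates = self.known_versions
        for version in candidates:
            method = getattr(self.target, "%s_v%d" % (basename, version), None)
            if method is None:
                continue  # peer does not offer this version, try an older one
            self.working[basename] = version
            return method(*args)
        raise AttributeError("no supported version of %r" % basename)


class OldServer:
    """Hypothetical peer that only implements version 2."""

    def publish_v2(self, chain):
        return ("v2", chain)


caller = VersionedCaller(OldServer(), known_versions=[2, 3])
result = caller.call("publish", ["cert"])
```

After the first call, caller.working records that version 2 worked for "publish", so later calls do not retry publish_v3.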

When the cert chain is serialized into a single string (for storage authority), we'll need to put a magic number and version identifier at the beginning of that string.

comment:3 Changed at 2008-06-19T07:44:50Z by warner

Actually, this is not really a prerequisite for accounting. We can just advertise "storage-login" instead of "storage" through the existing tuple-based interface.

This ticket is really related to "blessed storage servers", which are about protecting the client against low-reliability servers. Accounting is about protecting the servers against unauthorized clients.

So we can put this one off for a while.

comment:4 Changed at 2008-09-24T13:22:29Z by zooko

Is this a duplicate of #295?

comment:5 Changed at 2008-09-24T17:21:51Z by warner

not exactly, let's say that #295 is about distributed introduction, and this one (#466) is about signed/extendable announcements. I've updated the summary on #295 to match.

comment:6 Changed at 2008-10-07T02:03:21Z by warner

The code for this is pretty much done: my sandbox has a tree which does everything I want it to do, except for:

  • web display should show cert chains, not just last-cert pubkey
  • web display should clearly show which announcements are acceptable and which are not

I'm holding off on pushing it into trunk, though, for two reasons:

  • the ecdsa API in pycryptopp is going to change soon, in incompatible ways, so the tahoe code will need to be changed. Pushing this feature now would result in tahoe-1.3.0 being compatible with only pycryptopp-0.5.7 or older, which would be unfortunate (I don't like it when upgrading a supporting library causes the main application to break or start failing unit tests).
  • the ecdsa serialization format in pycryptopp is going to change soon (the #331 change), and if we can manage to not release a version of tahoe that uses the older (more fluffy) serialization scheme, then we can ease the versioning hassles a little bit.

I'll attach a diff with my changes, for safekeeping.

So this ticket is blocked on #331, and can be pushed to trunk (and then, barring problems, closed) shortly after #331 is finished and a new version of pycryptopp is released.

comment:7 Changed at 2009-03-01T04:38:15Z by warner

Just an update.. my patch for this will have bitrotted in the last 5 months.. enough trunk code has changed that it will no longer apply cleanly. I estimate it would take me about 3 or 4 days to bring it back up-to-date.

It is still quite blocked on ECDSA (#331). I can do about half the updating work without it (I'd have to create a fake ecdsa module and guess at what the eventual API will be).

comment:8 Changed at 2010-02-11T03:47:43Z by davidsarah

  • Keywords introducer forward-compatibility performance accounting added

comment:9 Changed at 2010-02-11T03:48:23Z by davidsarah

  • Keywords ecdsa pycryptopp added

comment:10 Changed at 2010-03-12T23:48:21Z by davidsarah

  • Keywords introduction added; introducer removed

comment:11 Changed at 2010-12-20T06:46:12Z by warner

Ok, that -ver6 patch is worth reviewing. I think there are a few more items I want to add before landing (more tests, mainly), but it's certainly worth talking about.

This version drops the certchain that was in the previous one: each announcement is either signed with a single pubkey, or not signed at all. It uses an embedded copy of my python-ecdsa library (https://github.com/warner/python-ecdsa), which is a bit on the slow side, but still fast enough for announcement/introducer use. It adds code to the IntroducerClient to add, distribute, and verify signatures, but does not add any code to the Client or StorageBroker to use those features: the actual UI/tahoe.cfg switches to enable signing or require signatures are left for a future patch.

The patch creates a new version of the Introducer, as well as its protocols. V2 servers (i.e. Introducers) can accept connections from either V1 or V2 clients, and V2 clients can tolerate talking to a V1 server, so all combinations are covered. Signatures can only be passed from a V2 client, through a V2 server, off to another V2 client: any V1 components along the way will lose the signature.

Each announcement is a dictionary with keys to replace everything that was in the V1 protocol's tuples. I added some more version information (the full app-versions dict), which may be too much (especially if you're trying to be anonymous).

The quirkiest thing about this scheme is the relationship between FURL tubids and ECDSA pubkeys. The Introducer is supposed to recognize multiple announcements from the same source and let the new one replace the old one. When both are V1 tuples with the same tubid in their FURLs, or when both are V2 dicts signed by the same pubkey, the relationship is easy. If a client is upgraded from V1 to V2 and starts signing its announcements, the V2 announcement won't replace its old V1 announcement. I'm not sure how to handle this yet.

In addition, we need to think about how/if we want to transition serverids from being FURL-based to being pubkey-based. Since serverids are baked into shares (both via the permuted serverlist and, more directly, by the Write-Enablers embedded in each mutable share), we can't just casually change an existing server's id. For brand-new servers, we could switch to using the pubkey as the serverid: this might make non-Foolscap-based share-transport protocols easier to secure. For old servers with pre-existing shares that start signing their announcements, we should probably keep using their tubid as a serverid, perhaps even after we switch away from foolscap and to some ECDSA-signature based protocol. But we must validate it: a bad server could publish a FURL that they don't actually control, with a tubid that matches the server they wish to impersonate, knowing that we'll never actually connect to the FURL and discover the problem. I think the validation protocol will involve connecting to the FURL and receiving a copy of the pubkey back, to prove that the owner of the FURL really does want to be associated with that pubkey.

comment:12 Changed at 2010-12-20T16:02:41Z by warner

  • Keywords review-needed added

oops, forgot to set the flag

comment:13 Changed at 2010-12-24T23:13:48Z by warner

I think I've got an idea to clean up the "what is your serverid anyway?" issue. For reference, this is the problem:

  • a server which is signing its Announcements holds two secrets:
    • the Foolscap Tub private key, which is paired with the public TubID
    • the ECDSA signing key, which is paired with the public verifying-key / keyid that shows up in the signed announcements
  • For many aspects of Tahoe's share-placement behavior, we need a reliable, stable, and unspoofable "serverid":
    • serverlist permutation
    • write-enabler secrets
    • consistent server reliability (if you think your share is safe because it's on server A, but someone else was able to make you put the share on server B instead, then you aren't getting this. Note that server A is allowed to unilaterally delegate its storage duties: the failure mode is if someone other than you and/or server A is able to make your share go elsewhere).
    • and eventually accounting stuff
  • To be unspoofable, the serverid needs to be the public half of some cryptographic pair, so either the TubID or the ECDSA keyid would do. To be stable for all current servers, it needs to be the TubID, since that's what we've been using since the beginning. But eventually we hope to get away from Foolscap and move to an easier-to-port-to-other-languages protocol, so eventually we'd prefer to use the ECDSA keyid instead.
  • To use the serverid safely, the client must validate it before allowing any of the storage-client code (Uploader/Downloader) to rely upon it.

So today's thought is:

  • the Introducer Announcement dictionary includes two new keys: 'serverid' and 'serverid-type'
  • the StorageFarmBroker (which subscribes to hear about storage announcements and manages RIStorageServer connections) keeps two lists: unvalidated announcements and validated connections. It only tells Uploader/Downloader about the validated connections, in a table indexed by serverid.
  • future world: new servers (who have never been known by their tubid) will set serverid-type to "pubkey_v1".
    • The broker will validate the announcement by merely comparing serverid against the keyid which was used to sign the announcement. (IntroducerClient already checks signatures and discards invalid ones, so the subscribe interface's callback is invoked with a keyid and an announcement dictionary).
    • these new-world servers can still accept Foolscap connections, but clients will connect to whatever FURL is in the announcement and won't pay attention to that FURL's tubid: they will strictly use the pubkey as a serverid.
  • current world: servers who *have* been known as a tubid will continue to do so: they will set serverid-type to "FURL_v1". When the broker sees this, it will:
    • compare the TubID of the claimed FURL against the announcement's serverid, abort if they do not match
    • connect to the FURL listed in the announcement
    • invoke a new remote method, maybe named "get_pubkey"
    • abort if the returned pubkey does not match the one used to sign the announcement
    • if they do match, validate the connection under the serverid
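
The broker-side checks in this list could be sketched roughly as follows. This is a synchronous simplification (the real connection and remote call would be asynchronous). The helpers furl_tubid and fetch_pubkey, the "FURL" dictionary key, and the choice to reject unsigned announcements outright are all assumptions made for illustration:

```python
def validate_announcement(ann, signing_keyid, furl_tubid, fetch_pubkey):
    """Return the validated serverid, or None if the announcement is rejected.

    ann            -- announcement dict with 'serverid', 'serverid-type', 'FURL'
    signing_keyid  -- keyid that signed the announcement (None if unsigned)
    furl_tubid     -- hypothetical helper: FURL string -> TubID
    fetch_pubkey   -- hypothetical helper: connects to the FURL and returns
                      the result of the proposed "get_pubkey" remote call
    """
    if signing_keyid is None:
        return None  # assumption: unsigned announcements are never validated
    serverid = ann["serverid"]
    kind = ann["serverid-type"]
    if kind == "pubkey_v1":
        # New-world server: the serverid must be the signing key itself.
        return serverid if serverid == signing_keyid else None
    if kind == "FURL_v1":
        # First check: the claimed serverid must be the FURL's TubID.
        if furl_tubid(ann["FURL"]) != serverid:
            return None
        # Second check: the Tub must vouch for the signing key.
        if fetch_pubkey(ann["FURL"]) != signing_keyid:
            return None
        return serverid
    return None  # unknown serverid-type: fail safe


def toy_tubid(furl):
    """Toy parser for FURLs shaped like pb://TUBID@host:port/name."""
    return furl[len("pb://"):].split("@", 1)[0]


ann = {"serverid": "tub123", "serverid-type": "FURL_v1",
       "FURL": "pb://tub123@example.invalid:1234/swiss"}
accepted = validate_announcement(ann, "keyA", toy_tubid, lambda furl: "keyA")
forged = validate_announcement(ann, "keyB", toy_tubid, lambda furl: "keyA")
```

Here accepted is the TubID, while forged is None because the Tub at that FURL does not vouch for the key that signed the announcement.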

So when the get_pubkey method returns a pubkey, that Tub is delegating all its authority to the signing key. (This authority includes the right to know the write-enablers, and the right to decide where its shares are placed.)

The first check (FURL.TubID vs serverid) is nominally redundant (why not just extract the tubid and use it as the serverid?), but I expect it will be convenient to keep a list of validated announcements around, and we'd like to be able to trust the serverid fields therein. It also provides useful non-validated data in some places (like the Introducer's web status page).

The second check (get_pubkey()==serverid) blocks a bunch of noise: bogus announcements made by someone other than the Tub operator, but pointing at that Tub. These announcements could contain false information about server capabilities, free space, versioning, etc (everything in the announcement dictionary). By making sure that the Tub in question is actually willing to be represented (in announcements) by that pubkey, we prevent anyone else from making these false claims.

The legacy authority we're trying to protect here is the tubid-based serverid. By restricting that authority to servers who can actually provide an object at that Tub, we prevent other parties from being able to claim that authority.

It's still an open question in my mind whether this transition is a good idea overall. I think it will help Tahoe if we can write clients in languages other than Python, and relying upon Foolscap is a barrier to that (we could port Foolscap to other languages, but that feels like more work for less gain than changing Tahoe). If that's a good direction to go, then switching to a serverid based on some cryptographic quality *other* than a foolscap tubid is a necessary step.

I just worry about the existing servers and the long-term baggage that they'll need to carry around (since all their mutable shares are tied to the serverid, they'll need to keep a Tub around forever, to prove knowledge of the Tub private key, even if all the connections are using ECDSA-signed HTTP messages or whatnot). (Incidentally, this might be relaxed somewhat if foolscap#19 were implemented, allowing the SSL key to be used for signing arbitrary data, because then we could publish the public cert and its signature on the ECDSA pubkey in the announcement. Both sides would still need to do some SSL work to verify the signature, but they wouldn't strictly need to use Foolscap to do it, nor would they need to use foolscap connections to move shares around.)

comment:14 Changed at 2011-01-02T19:44:08Z by nejucomo

  • Owner set to nejucomo

I've begun a review of this ticket. Some initial questions from the first design comment follow. I may answer some of these by reading the patch and the trunk version.

Q1. Why does the design specify EC-DSA-192? What are the requirements which drive this algorithm selection?

Zooko suggested on IRC that the primary goal is small public keys.

Q2. Why is the pubkey-identifier optional? What use case does this facilitate?

Q3. If the identifier is absent, verification is checked against all "root certs". How are these managed?

comment:15 Changed at 2011-01-04T19:33:08Z by warner

nejucomo: please start your review with the latest patch (466-ver6.diff). The design has changed a lot since this ticket was first opened, and most of those questions are no longer applicable.

comment:16 Changed at 2011-01-06T00:40:35Z by davidsarah

  • Milestone changed from undecided to 1.9.0

Changed at 2011-02-07T18:29:02Z by warner

copy python-ecdsa-0.7 into tree as allmydata.util.ecdsa

Changed at 2011-02-07T18:29:36Z by warner

add 'tahoe admin generate-keypair/derive-pubkey' commands, apply on top of p1

Changed at 2011-02-07T18:36:43Z by warner

comment:17 Changed at 2011-02-12T21:07:36Z by warner

State of the patch

I'd say this project is about 80% complete. The patch currently attached for review is split up into three pieces. This note is to explain what those pieces do, what the overall design is like, and what's left to design/build.

Why Do We Want This?

The goal of this ticket is to add signatures to our introducer announcements, and to make them extensible. Current announcements are a fixed 6-tuple: (FURL, service_name, remote_interface_name, nickname, version, oldest_supported). The new announcements will be an arbitrary JSON-serializable dictionary, embedded in a 3-tuple of (ann_json, signature, pubkey), where the sig/pubkey are optional.

The utility of extensibility is pretty obvious: there are additional services and features we could enable (or make more efficient) if we could safely advertise them ahead of time through the introducer. In general, extensible protocol formats (dicts, not tuples) improve flexibility and enable change, since it's awfully difficult to make changes to all nodes simultaneously. Flexible announcements aren't strictly necessary: we could instead e.g. add new methods to the RIStorageServer object, and have clients speculatively attempt to invoke them, and gently tolerate NameErrors, but that seems inelegant, and requires a roundtrip: by putting slowly-changing things in the announcements, clients learn about them earlier.

The value of adding signatures is great, but not immediate. Signatures would bind the contents of an announcement to some "serverid", preventing other parties (other grid members, or the Introducer itself) from forging those contents. This would turn the introduction system into a secure channel from publisher to subscriber, indexed by serverid.

Without signatures, anybody in the grid can publish anything they like, such as a record with Alice's real storage-server FURL but with a nickname of "Bob", which would currently replace Alice's real announcement.

Currently we use Foolscap Tub IDs (i.e. hash of the tub certificate, which appears in each FURL) as server IDs. The only way to verify possession of the corresponding secret is to connect to a FURL that uses this tubid: Foolscap ensures that the object you connect to (and subsequently send callRemote messages to) is selected by the secret-holder. This requires an online check, whereas a signed message could be verified offline.

We currently use serverids for three things:

  • to distinguish between storage servers who wish to be treated separately, sending separate shares to each
  • as a stable long-term seed for the permuted peerlist, used to decide how to distribute shares of each file
  • to calculate several shared secrets: the mutable-file "write-enabler", and the renew/expire lease tokens. These need to be different for each server, so that when a client exercises their authority on server A, that doesn't enable A to exercise the client's corresponding authority on server B
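
To see why the serverid has to stay stable for the permuted peerlist, here is a rough sketch. It orders servers by a hash of the serverid combined with a per-file storage index. This is illustrative only and is not claimed to be Tahoe's exact hash construction; permuted_servers and the byte strings are made up:

```python
import hashlib


def permuted_servers(serverids, storage_index):
    """Order servers by a hash of (serverid + storage index).

    Illustrative only: not Tahoe's exact hash construction.
    """
    return sorted(
        serverids,
        key=lambda sid: hashlib.sha256(sid + storage_index).digest(),
    )


servers = [b"server-A", b"server-B", b"server-C"]
order_before = permuted_servers(servers, b"storage-index-1")

# If server B changes its id, its sort key for every file is recomputed
# from a different hash, so its position in each permutation may move and
# clients may look for existing shares in the wrong place.
renamed = [b"server-A", b"server-B-new-id", b"server-C"]
order_after = permuted_servers(renamed, b"storage-index-1")
```

The ordering depends only on the set of serverids and the storage index, which is what lets every client compute the same list independently.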

In the future, we would like to also use serverids to:

  • enable explicit server selection: tahoe.cfg could list the serverids that uploads will use, ignoring all others, to protect the user's upstream bandwidth and reliability choices. This selection could be delegated to a central party, allowing the server list to change over time without constant user involvement.
  • correlate reciprocal Accounting relationships

I *think* that a fully distributed introducer requires announcement signatures. With a single central Introducer, we could achieve some measure of control over the grid by restricting publishing access to certain servers. But with a highly distributed log-flood-based introduction system, we'd give up central control, and I think individually-traceable announcements would make up for that loss.

Finally, a long-term goal is to move away from Foolscap to an HTTP-based protocol that is easier to implement in non-Python languages, to facilitate multiple implementations of the Tahoe protocol. Without Foolscap, we'll need a different mechanism to securely identify a server, for which the announcement signing keypair is appropriate.

patch 1: python-ecdsa

https://github.com/warner/python-ecdsa/ is where I maintain a pure-python ECDSA library, with a not-too-bad API. I think I may want to make some changes to the API still. It's fast enough for use by signed announcements, since sign/verify operations occur only once per server. Once pycryptopp acquires ECDSA support, we should move to that, for the 30x speedup.

The library is embedded into allmydata/src/util/ecdsa/*.py, rather than being added as a dependency, for two reasons. First, extra dependencies are really making packagers' lives difficult, slowing packaging efforts and thus hurting adoption. Second, changes to the upstream API will be easier to accommodate by using a fixed version of python-ecdsa, so I think it makes sense to copy it wholesale into the tahoe tree until the API stabilizes (which needs to be driven by using it and learning what works and what doesn't).

patch 2: keypair generation

This adds tahoe admin generate-keypair and derive-pubkey, basic userspace tools to work with keys. The basic idea is that some day you might use them to create a value that you then paste into a tahoe.cfg file. They are currently unused.

patch 3: everything else

I'll split this into:

  • terminology
  • V2 introducer protocol
  • announcement signatures
  • backwards compatibility with V1 protocol
  • serverid computation

Terminology

The IntroducerServer is an object that lives in the introducer process, the one (just one, so far, but #68 will change that) identified by the introducer.furl. Most of this note calls this the "server". The IntroducerClient is an object that lives in each non-introducer process, both tahoe storage servers and tahoe clients, which manages the connection to the IntroducerServer.

The "publisher" is a Tahoe node (usually a storage server) which wants to broadcast information about themselves to the whole grid. Each time they do this, they are said to "announce" their information. The bundle of information is called an "announcement".

The "subscriber" is a Tahoe node (usually a client/gateway) that wants to receive announcements.

Subscribers call their local IntroducerClient.subscribe_to to sign up to hear announcements, and provide a callback that will be invoked multiple times as they arrive. This provokes the IntroducerClient to send a "subscribe" message to the server. Later, the server will send an "announce" message to the client, and the IntroducerClient will fire the callback.

Publishers call IntroducerClient.publish to deliver announcements, which provokes the IntroducerClient to send a "publish" message to the server, which provokes the server to send "announce" messages to all interested clients.
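
The publish/subscribe/announce flow described in this section can be sketched with an in-memory toy. It folds the IntroducerClient into direct method calls and uses no Foolscap or network. ToyIntroducerServer is a hypothetical name, and replaying earlier announcements to a late subscriber is an assumption of the sketch:

```python
class ToyIntroducerServer:
    """In-memory stand-in for the IntroducerServer (no Foolscap, no network)."""

    def __init__(self):
        self.subscribers = {}    # service_name -> list of callbacks
        self.announcements = {}  # service_name -> list of announcements

    def subscribe(self, service_name, callback):
        """Handle a "subscribe" message from a client."""
        self.subscribers.setdefault(service_name, []).append(callback)
        # Assumption: a late subscriber also hears earlier announcements.
        for ann in self.announcements.get(service_name, []):
            callback(ann)

    def publish(self, service_name, ann):
        """Handle a "publish" message and fan it out to subscribers."""
        self.announcements.setdefault(service_name, []).append(ann)
        for callback in self.subscribers.get(service_name, []):
            callback(ann)  # this is the "announce" message


server = ToyIntroducerServer()
heard = []
server.subscribe("storage", heard.append)
server.publish("storage", {"nickname": "server-1"})
server.publish("storage", {"nickname": "server-2"})

late = []
server.subscribe("storage", late.append)
```

The subscriber's callback fires once per announcement, which matches the "invoked multiple times as they arrive" behaviour described above.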

V2 Introducer Protocol

The old V1 protocol used 6-tuples as announcements: (FURL, service_name, remoteinterface_name, nickname, my_version, oldest_supported). The new V2 protocol uses an open-ended JSON-serializable dictionary, with a number of top-level keys that are expected to be present.

The V2 client->server "publish" message adds a "canary" argument, which allows the server to detect when the publisher has disconnected. This is unused so far, but the intent is to display liveness status on the server's "introweb" page, and to let them stop publishing data for servers which are offline (or perhaps have remained offline for several days).

The V2 client->server "subscribe" message adds a "subscriber_info" argument, which lets the server's introweb page show information about each subscriber (mostly nickname and version). In the V1 protocol, this was accomplished by having each subscriber also "announce" a special "stub_client" service, which didn't correspond to a real service, but included enough information to build the status display. The V2 subscriber_info field is defined (by foolscap schema) to be a dictionary with string keys, with at least "nickname", "my-version", "app-versions", and "oldest-supported".

Announcement Signatures

V2 announcements on the wire are a 3-tuple (ann_d_json, sig_hex, pubkey_hex), in which the announcement is serialized with JSON. Unsigned announcements have sig_hex == pubkey_hex == None.

We use 256-bit ECDSA keys (from the NIST256p curve, since that seemed to be the most widely implemented, in openssl/nss). pubkey_hex is a base16-encoded uncompressed raw binary key (TODO: versioning), the output of VerifyingKey.to_string().encode("hex"). This is 512 bits long, and includes neither an OID nor a 0x04 "uncompressed" flag byte. Signatures are computed with SHA1 (TODO: given NIST256p, let's use SHA256), with an algorithm that is compatible with openssl (verified by the python-ecdsa test suite). sig_hex is the output of SigningKey.sign(ann_d_json.encode("utf-8")).encode("hex") which uses python-ecdsa's minimal binary string encoding (no versioning information).

Publishers give their IntroducerClient.publish both an announcement dictionary and a SigningKey instance (or None). Subscribers receive an announcement dictionary and a VerifyingKey instance (which will be None if the announcement did not have a matching valid signature, which covers unsigned announcements, forged/invalid signatures, and valid signatures from some different pubkey).
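
A sketch of the wire framing described in this section, written for Python 3 (so bytes.hex() replaces the Python 2 .encode("hex") spelling used above). The sign/verify callables are parameters so that the sketch runs without python-ecdsa. The toy_sign/toy_verify pair below is a placeholder and is NOT a signature scheme; the function names are hypothetical:

```python
import hashlib
import json


def pack_announcement(ann_d, sign=None, pubkey=None):
    """Build the (ann_d_json, sig_hex, pubkey_hex) wire tuple."""
    ann_d_json = json.dumps(ann_d, sort_keys=True)
    if sign is None:
        return (ann_d_json, None, None)  # unsigned announcement
    sig = sign(ann_d_json.encode("utf-8"))
    return (ann_d_json, sig.hex(), pubkey.hex())


def unpack_announcement(wire, verify):
    """Return (ann_d, pubkey); pubkey is None unless the signature checks out."""
    ann_d_json, sig_hex, pubkey_hex = wire
    ann_d = json.loads(ann_d_json)
    if sig_hex is None or pubkey_hex is None:
        return (ann_d, None)
    pubkey = bytes.fromhex(pubkey_hex)
    sig = bytes.fromhex(sig_hex)
    # Verify against the received string, not a re-encoding of ann_d.
    if not verify(pubkey, sig, ann_d_json.encode("utf-8")):
        return (ann_d, None)
    return (ann_d, pubkey)


# Placeholder so the sketch runs without python-ecdsa. It is NOT a
# signature scheme: anyone who knows the "pubkey" can forge it.
TOY_PUBKEY = b"toy-public-key"


def toy_sign(msg):
    return hashlib.sha256(TOY_PUBKEY + msg).digest()


def toy_verify(pubkey, sig, msg):
    return sig == hashlib.sha256(pubkey + msg).digest()


signed = pack_announcement({"nickname": "alice"}, toy_sign, TOY_PUBKEY)
tampered = (signed[0].replace("alice", "bob"), signed[1], signed[2])
unsigned = pack_announcement({"nickname": "alice"})
```

With a real library, sign would wrap SigningKey.sign and verify would wrap the matching VerifyingKey check; the framing and the None-on-failure result stay the same.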

Backwards Compatibility with V1

There are four interesting V1+V2 compatibility cases, two on the publishing half, and two on the subscribing/announcement half.

The V2 IntroducerServer provides all the same method names as the V1 server, plus additional publish_v2/subscribe_v2 methods that are only used by V2 clients. On the V1 methods, the V2 server accepts the same message format as the V1 server. The server seeks to hide the client versions from each other: a V1 client receives only V1-format announcements, and a V2 client receives only V2-format announcements, regardless of what client version generated those announcements.

When a V1 client publishes to a V2 server, it uses the old "publish" method name, allowing the server to detect the client's old version. The server upconverts the V1-format announcement tuple into an unsigned V2-format dictionary, leaving some fields empty (like ann_d["app_versions"]={}) when necessary. It then dispatches this V2-format announcement internally as if it was received from a real V2 client. When a V1 client subscribes to a V2 server, the old server-side "subscribe" method wraps the remote reference in a SubscriberAdapter_v1, which behaves just like a remote reference to a modern V2 subscriber, but downconverts the messages to old-style V1 tuples before sending them over the wire.
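
The up- and down-conversion can be sketched as below. Only the "app_versions" key is taken from the text above; the other dictionary key names, and the function names, are assumptions chosen to mirror the V1 tuple fields:

```python
# Assumed V2 dictionary keys, one per V1 tuple position.
V1_FIELDS = ("FURL", "service-name", "remoteinterface-name",
             "nickname", "my-version", "oldest-supported")


def upconvert_v1(ann_t):
    """Turn a V1 6-tuple into an unsigned V2-style dictionary."""
    ann_d = dict(zip(V1_FIELDS, ann_t))
    ann_d["app_versions"] = {}  # V1 carries no app-versions information
    return ann_d


def downconvert_v2(ann_d):
    """Turn a V2-style dictionary into a V1 6-tuple, dropping extra keys."""
    return tuple(ann_d[name] for name in V1_FIELDS)


v1_ann = ("pb://tubid@example.invalid:1234/swissnum", "storage",
          "RIStorageServer.tahoe.allmydata.com", "alice", "tahoe/1.8", "1.0")
v2_ann = upconvert_v1(v1_ann)
round_trip = downconvert_v2(v2_ann)
```

Downconversion is lossy by design: any keys beyond the six V1 fields (and any signature) cannot be represented in the tuple.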

When a V2 client tries to publish an announcement, it first tries to invoke the new "publish_v2" method, with a V2-style announcement dictionary (maybe signed). If this callRemote fails in a way that looks like the server does not implement publish_v2, the client concludes that it is dealing with a V1 server. It then downconverts its announcement to V1-style and sends it to the old V1 "publish" method.

When a V2 client wants to subscribe, it first tries the new "subscribe_v2" method, passing itself as the desired recipient of "announce_v2" messages (the remote callback, in a sense). If the "subscribe_v2" method fails, the client concludes that it's dealing with a V1 server, and calls the old V1 "subscribe" method instead, passing a different object that can accept V1-style announcements. This client-side object upconverts each V1 announcement into a V2-format dictionary before internal delivery. The client also publishes a "stub_client" announcement, so that the V1 server can display nicknames and version numbers of all subscribed clients.

Server ID Computation

This is the trickiest part, and ties into our eventual goals for having signed announcements.

In the ideal future world, as we envision it today, we no longer use Foolscap, have no Tub IDs, and use HTTP to send signed unencrypted storage-protocol messages from client to server. In that world, the ECDSA pubkey is the serverid, and clients use signed announcements to learn genuine information about servers.

But since we've been using foolscap tubids as serverids, and since we've been using serverids to compute both serverlist-permutation and shared-secrets, there is a compatibility concern. Any shares we've uploaded to tubid-based servers must continue to be accessible with the old serverid. I think this means that, even when we add an ECDSA pubkey to those servers, they need to continue to use their tubid as serverid, and we need to allow an announcement signed with pubkey-A to correctly claim to have a serverid tubid-B. This will require a verification step, in which the client connects to the tubid-B-bearing FURL, and the server is expected to announce (over that channel) that it uses pubkey-A. Once that is accomplished, the client can believe other metadata included in the signed announcement.

This feels complex, so I'm uneasy about it, but here's the protocol I have in mind:

  • new-style announcements include two keys "serverid" and "serverid-type"
    • for brand-new servers on pure-V2 grids, serverid-type is "pubkey_v1", and serverid is the ASCII pubkey
    • for all other servers, serverid-type is "furl_v1", and serverid is base32-encoded tubid
  • when client receives announcement from Introducer:
    • serverid-type="pubkey_v1": assert ann[serverid] matches signature
    • serverid-type="furl_v1": connect to ann[furl], send a "getAnnouncement" message, response is a new announcement (hopefully identical to the one from the Introducer), assert that serverid in new ann matches furl, then accept new announcement (ignoring the original).
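
The client-side checks in the list above can be sketched as follows. This is an illustration, not the proposed implementation: `accept_announcement` and `fetch_announcement` are hypothetical names (the latter stands in for connecting to the FURL and sending "getAnnouncement"), and the FURL is assumed to have the shape pb://TUBID@hints/swissnum:

```python
def tubid_from_furl(furl):
    """Extract the tubid from a FURL shaped like pb://TUBID@hints/swissnum."""
    if not furl.startswith("pb://") or "@" not in furl:
        raise ValueError("not a FURL: %r" % (furl,))
    return furl[len("pb://"):].split("@", 1)[0]


def accept_announcement(ann, signer_pubkey, fetch_announcement):
    """Return the announcement the client should believe, or raise ValueError.

    signer_pubkey is the key whose signature on `ann` was already verified.
    fetch_announcement(furl) is a caller-supplied function that asks the
    server at `furl` for its own copy of the announcement.
    """
    kind = ann["serverid-type"]
    if kind == "pubkey_v1":
        # The claimed serverid must be the key that signed the announcement.
        if ann["serverid"] != signer_pubkey:
            raise ValueError("serverid does not match signing key")
        return ann
    if kind == "furl_v1":
        # Ask the server directly, then trust only what it said over that
        # channel, ignoring the copy relayed by the Introducer.
        fresh = fetch_announcement(ann["furl"])
        if fresh["serverid"] != tubid_from_furl(ann["furl"]):
            raise ValueError("serverid does not match tubid in FURL")
        return fresh
    raise ValueError("unknown serverid-type: %r" % (kind,))
```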

remaining work

  • do we need version numbers in serverids? in signatures?
  • serverid format: they are used in human-readable config files, permutation hash. Current serverids are binary. Make new ones ASCII?
  • switch ECDSA hash step from SHA1 to SHA256
  • clean up ascii/unicode transitions
    • plan for HTTP protocol
    • allow unicode in announcement dictionary (especially in nickname)
    • sign only well-defined binary data
    • manage signature handling well enough to allow variety of transport protocols (i.e. stick to ASCII)
  • should we replace all data arguments with a single JSON string per message, in anticipation of switching the Introducer protocol to HTTP? (i.e. design the HTTP protocol first, then map it to Foolscap)
  • turning it all on

comment:18 Changed at 2011-02-21T22:49:01Z by writefaruq

  • Cc writefaruq@… added

comment:19 Changed at 2011-07-16T21:28:13Z by davidsarah

  • Owner changed from nejucomo to warner

Brian's opinion needed on whether this will be ready for 1.9.

comment:20 Changed at 2011-07-25T06:04:40Z by warner

  • Milestone changed from 1.9.0 to 1.10.0

sigh, no. I grow increasingly less hopeful that this will ever see the light of day. Bumping out of 1.9

comment:21 Changed at 2011-08-01T03:56:50Z by warner

  • Keywords review-needed removed

comment:22 Changed at 2011-11-20T19:50:31Z by warner

ok, we're getting close here. The current work is on my github branch:

https://github.com/warner/tahoe-lafs/tree/466-take8

(note: I rebase this branch frequently. Also, I switch to subsequent "takeNN" branches when the need arises)

We went (quickly) over the design and code at the Summit, and I fixed all the issues that were raised there. So the branch is ready to land once the necessary dependencies are in place. The sequence from here is:

  • Zooko lands the patch in pycryptopp#75
  • Zooko makes a new release of pycryptopp (maybe named 0.6?)
  • we update tahoe's _auto_deps.py to depend on the new pycryptopp
  • we land my 466-take8 branch
  • ???
  • profit!

comment:23 Changed at 2011-11-30T00:57:09Z by davidsarah

A darcs patch against trunk corresponding to the git branch is at https://tahoe-lafs.org/~davidsarah/patches/introducer-signed-messages.darcs.patch.

I made some minor changes to the code and tests for the tahoe admin commands (see the patch descriptions), and fixed some duplicate umids; apart from that the code in the darcs patch is the same. I confirmed that it passes tests on Ubuntu Maverick x86-64, provided that pycryptopp with the https://tahoe-lafs.org/trac/pycryptopp/ticket/75 patch is installed, and modulo some failures due to #1586.

comment:24 Changed at 2011-11-30T01:48:09Z by davidsarah

At warner's request, I rerecorded the darcs patches so that my changes are in separate patches to the changes from the git branch. They are at https://tahoe-lafs.org/~davidsarah/patches/introducer-signed-messages-v2.darcs.patch.

Note that 'two.diff' (output of git show a604a) is still split into two darcs patches, for code and test changes.

comment:25 Changed at 2012-03-14T01:26:57Z by Brian Warner <warner@…>

In bc21726dfd73b434:

new introducer: signed extensible dictionary-based messages! refs #466

This introduces new client and server halves to the Introducer (renaming the
old one with a _V1 suffix). Both have fallbacks to accommodate talking to a
different version: the publishing client switches on whether the server's
.get_version() advertises V2 support, the server switches on which
subscription method was invoked by the subscribing client.

The V2 protocol sends a three-tuple of (serialized announcement dictionary,
signature, pubkey) for each announcement. The V2 server dispatches messages
to subscribers according to the service-name, and throws errors for invalid
signatures, but does not otherwise examine the messages. The V2 receiver's
subscription callback will receive a (serverid, ann_dict) pair. The
'serverid' will be equal to the pubkey if all of the following are true:

  • the originating client is V2, and was told a privkey to use
  • the announcement went through a V2 server
  • the signature is valid

If not, 'serverid' will be equal to the tubid portion of the announced FURL,
as was the case for V1 receivers.

Servers will create a keypair if one does not exist yet, stored in
private/server.privkey .

The signed announcement dictionary puts the server FURL in a key named
"anonymous-storage-FURL", which anticipates upcoming Accounting-related
changes in the server advertisements. It also provides a key named
"permutation-seed-base32" to tell clients what permutation seed to use. This
is computed at startup, using tubid if there are existing shares, otherwise
the pubkey, to retain share-order compatibility for existing servers.
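
A rough sketch of the three-tuple and the serverid rule described in this commit message, not the landed code. `pack`, `unpack`, `sign` and `verify` are hypothetical names, and the signing functions are supplied by the caller so the example does not depend on any particular signature library. Raising on a signature that fails verification, rather than falling back to the tubid, is an assumption of this sketch:

```python
import json


def tubid_from_furl(furl):
    """Extract the tubid from a FURL shaped like pb://TUBID@hints/swissnum."""
    return furl[len("pb://"):].split("@", 1)[0]


def pack(ann, sign=None, pubkey=None):
    """Build (serialized announcement, signature, pubkey).

    Unsigned announcements carry None for both signature and pubkey.
    """
    msg = json.dumps(ann, sort_keys=True).encode("utf-8")
    if sign is None:
        return (msg, None, None)
    return (msg, sign(msg), pubkey)


def unpack(ann_t, verify):
    """Return the (serverid, ann_dict) pair handed to the subscriber."""
    msg, sig, pubkey = ann_t
    ann = json.loads(msg.decode("utf-8"))
    if sig is None or pubkey is None:
        # Unsigned (e.g. V1 origin): fall back to the tubid in the FURL.
        return tubid_from_furl(ann["anonymous-storage-FURL"]), ann
    if not verify(pubkey, sig, msg):
        raise ValueError("invalid signature on announcement")
    return pubkey, ann
```

Verification is done over the exact serialized bytes that were received, so the receiver never has to re-serialize the dictionary to check the signature.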

comment:26 Changed at 2012-03-14T01:27:42Z by Brian Warner <warner@…>

In bc21726dfd73b434:

new introducer: signed extensible dictionary-based messages! refs #466

comment:27 follow-up: Changed at 2012-03-14T02:50:29Z by davidsarah

Servers will create a keypair if one does not exist yet, stored in private/server.privkey .

Existing private key files are called *.pem, e.g. node.pem.

comment:28 follow-up: Changed at 2012-03-14T02:54:16Z by davidsarah

Don't we need to make Tahoe dependent on "pycryptopp >= 0.6.0"?

comment:29 in reply to: ↑ 28 Changed at 2012-03-14T03:39:56Z by davidsarah

Replying to davidsarah:

Don't we need to make Tahoe dependent on "pycryptopp >= 0.6.0"?

Oh, it is, but the addition to install_requires is done in the require_more function of src/allmydata/_auto_deps.py. It could just as well be in the static definition of install_requires, since it's unconditional.

comment:30 in reply to: ↑ 27 ; follow-up: Changed at 2012-03-14T06:04:16Z by warner

Replying to davidsarah:

Servers will create a keypair if one does not exist yet, stored in private/server.privkey .

Existing private key files are called *.pem, e.g. node.pem.

This new server key is an Ed25519 key (stored as 32 binary bytes), not an X509/SSL key, so .pem didn't seem appropriate. I'm happy to use a different name, though.

It could just as well be in the static definition of install_requires, since it's unconditional.

Good point, I'll move it there.

comment:31 in reply to: ↑ 30 Changed at 2012-03-14T21:20:31Z by davidsarah

Replying to warner:

Replying to davidsarah:

Existing private key files are called *.pem, e.g. node.pem.

This new server key is an Ed25519 key (stored as 32 binary bytes), not an X509/SSL key, so .pem didn't seem appropriate. I'm happy to use a different name, though.

Oh, in that case server.privkey is fine.

comment:32 Changed at 2012-04-01T00:31:20Z by davidsarah

  • Milestone changed from 1.11.0 to 1.10.0

comment:33 Changed at 2012-05-14T23:18:09Z by warner

  • Resolution set to fixed
  • Status changed from new to closed

Ok, time to close this one out. The code landed a while ago, and things look stable. A few more notes about the final protocol that we settled on (and how it differs from the big explanation in comment:17):

  • announcements contain separate fields for the different uses of a "server id". This turned out to work better than having a "serverid-type" marker.
    • anonymous-storage-FURL: points at the foolscap object that provides non-Accounting-based storage service, using the same protocol we've been using for years. It has the "anonymous-" prefix to distinguish it from the non-anonymous thing that Accounting (#666) will provide. When a server requires accounting-based connections, its announcement will omit anonymous-storage-FURL.
    • permutation-seed-base32: tells clients where to place this server in the permuted ring, for share-placement purposes. The server gets to decide this placement independently of its serverid or tubid (this is marginally more power than they had before, but we decided it was safe). New servers use the public-key -based serverid for this. Old servers (those which have published shares under their tubid) keep using their tubid for this, to maintain stability (clients keep looking for shares in the same old place). The code in client.py:_init_permutation_seed() holds this "Am I old or new" logic.
  • the IntroducerClient which receives these announcements produces an IServer object, specifically an allmydata.storage_client.NativeStorageServer, which offers methods that client code can use to figure out how to talk to the server.
    • get_permutation_seed(): maps directly to permutation-seed-base32
    • get_lease_seed() and get_foolscap_write_enabler_seed(): both return the *tubid*, never the pubkey-based serverid, because these are used for shared-secret authorization of add-lease/remove-lease/modify-share operations, and shared-secrets are only safe to use inside a channel that's tied to those secrets. Since these operations are closely tied to Foolscap, it's ok to leave them using the tubid. When we move to a non-Foolscap transport layer, we'll need to use different authorization mechanisms for leases and write-enablers, and these will go away.
  • the pubkey-based serverid will be used for all future explicit-server-selection and Accounting needs, never the tubid
  • the code uses ed25519 signatures (from pycryptopp), not python-ecdsa. These signatures are shorter, faster, and more secure.
  • serverids have "v0-" version prefixes, as do the signatures. The pubkey and signatures are sent as base32-encoded strings, to make non-binary-safe transports easier to accommodate. When presenting serverids to users, the v0- prefix is removed.
  • announcements are JSONable dictionaries with unicode fields. The signature is computed over its UTF-8 encoding.
  • the new protocol is enabled by default. As soon as you start a node with the new code, it will generate a new key and start using it.
  • a copy of the old (V1) introducer client+server is retained in src/allmydata/introducer/old.py for testing purposes. Once we're no longer concerned with maintaining compatibility with V1 introducers or clients, we can remove it.
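
To illustrate the "Am I old or new" decision and the v0- prefix handling from the notes above, here is a small sketch. It is not the code in client.py: the function names, the boolean `has_existing_shares` argument, and the choice of lowercase unpadded base32 are assumptions made for this example:

```python
import base64


def b2a(data):
    """Encode bytes as lowercase, unpadded base32 (assumed encoding)."""
    return base64.b32encode(data).decode("ascii").lower().rstrip("=")


def choose_permutation_seed(tubid_bytes, pubkey_bytes, has_existing_shares):
    """Pick the value to announce as permutation-seed-base32.

    Old servers (those already holding shares) keep their tubid so clients
    keep looking for shares in the same place; new servers use the pubkey.
    """
    seed = tubid_bytes if has_existing_shares else pubkey_bytes
    return b2a(seed)


def display_serverid(serverid):
    """Drop the v0- version prefix before showing a serverid to a user."""
    prefix = "v0-"
    return serverid[len(prefix):] if serverid.startswith(prefix) else serverid
```

For example, `choose_permutation_seed(tubid, pubkey, has_existing_shares=True)` returns the tubid-derived seed, so a server that adds a key later does not move in the permuted ring.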