.. -*- coding: utf-8-with-signature -*-

=======================
Node Keys in Tahoe-LAFS
=======================

"Node Keys" are cryptographic signing/verifying keypairs used to
identify Tahoe-LAFS nodes (client-only and client+server). The private
signing key is stored in NODEDIR/private/node.privkey, and is used to
sign the announcements that are distributed to all nodes by the
Introducer. The public verifying key lets those other nodes identify
the sender: it is displayed as a "Node ID" that looks like
"v0-abc234xyz567..", which ends with a long base32-encoded string.

These node keys were introduced in the 1.10 release (April 2013), as
part of ticket #466. In previous releases, announcements were unsigned,
and nodes were identified by their Foolscap "Tub ID" (a somewhat shorter
base32 string, with no "v0-" prefix).

Why Announcements Are Signed
----------------------------

All nodes (both client-only and client+server) publish announcements to
the Introducer, which then relays them to all other nodes. These
announcements contain information about the publishing node's nickname,
how to reach the node, what services it offers, and what version of code
it is running.

The new private node key is used to sign these announcements, preventing
the Introducer from modifying their contents en route. This will enable
future versions of Tahoe-LAFS to use other forms of introduction
(gossip, multiple introducers) without weakening the security model.

The Node ID is useful as a handle with which to talk about a node. For
example, when clients eventually gain the ability to control which
storage servers they are willing to use (#467), the configuration file
might simply include a list of Node IDs for the approved servers.

TubIDs are currently also suitable for this job, but they depend upon
having a Foolscap connection to the server. Since our goal is to move
away from Foolscap towards a simpler (faster and more portable)
protocol, we want to reduce our dependence upon TubIDs. Node IDs and
Ed25519 signatures can be used for non-Foolscap, non-SSL-based
protocols.

How The Node ID Is Computed
---------------------------

The long-form Node ID is the Ed25519 public verifying key, 256 bits (32
bytes) long, base32-encoded, with the trailing "=" padding removed and a
"v0-" prefix prepended, like so::

    v0-rlj3jnxqv4ee5rtpyngvzbhmhuikjfenjve7j5mzmfcxytwmyf6q

The Node ID is displayed in this long form on the node's front Welcome
page, and on the Introducer's status page. In most other places
(share-placement lists, file health displays), the "short form" is used
instead. This is simply the first 8 characters of the base32 portion,
frequently enclosed in square brackets, like this::

    [rlj3jnxq]

In contrast, old-style TubIDs are usually displayed with just 6 base32
characters.

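The long and short forms described above can be sketched in a few lines
of Python. This is only an illustration, not the code Tahoe-LAFS itself
uses: the function names are hypothetical, and it assumes the base32
alphabet is the standard RFC 4648 one, rendered in lower case.

```python
import base64

def node_id_from_pubkey(pubkey_bytes: bytes) -> str:
    """Format a 32-byte Ed25519 verifying key as a long-form Node ID."""
    if len(pubkey_bytes) != 32:
        raise ValueError("expected a 32-byte Ed25519 verifying key")
    # The standard library emits upper-case, "="-padded base32; the IDs
    # shown above are lower-case with the padding removed.
    b32 = base64.b32encode(pubkey_bytes).decode("ascii").lower().rstrip("=")
    return "v0-" + b32

def short_form(node_id: str) -> str:
    """Return the bracketed 8-character short form of a long-form Node ID."""
    prefix = "v0-"
    if not node_id.startswith(prefix):
        raise ValueError("not a v0 Node ID")
    # Skip the prefix, then keep the first 8 base32 characters.
    return "[" + node_id[len(prefix):len(prefix) + 8] + "]"
```

Applied to the example ID above, ``short_form`` yields ``[rlj3jnxq]``. A
32-byte key always encodes to 52 base32 characters, so a long-form Node
ID is 55 characters including the prefix.
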
Version Compatibility, Fallbacks For Old Versions
-------------------------------------------------

Since Tahoe-LAFS 1.9 does not know about signed announcements, 1.10
includes backwards-compatibility code to allow old and new versions to
interoperate. There are three relevant participants: the node publishing
an announcement, the Introducer that relays it, and the node receiving
the (possibly signed) announcement.

When a 1.10 node connects to an old Introducer (version 1.9 or earlier),
it sends downgraded non-signed announcements. It likewise accepts
non-signed announcements from the Introducer. The non-signed
announcements use TubIDs to identify the sending node. The new 1.10
Introducer, when it connects to an old node, downgrades any signed
announcements to non-signed ones before delivery.

As a result, the only way to receive signed announcements is for all
three systems to be running the new 1.10 code. In a grid with a mixture
of old and new nodes, if the Introducer is old, then all nodes will see
unsigned TubIDs. If the Introducer is new, then nodes will see signed
Node IDs whenever possible.

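The downgrade rules above reduce to a small truth table. The sketch
below models them with hypothetical function names, where "new" means
running 1.10 or later; it illustrates the rules as stated here and is
not taken from the Tahoe-LAFS source.

```python
def announcement_is_signed(publisher_new: bool, introducer_new: bool,
                           receiver_new: bool) -> bool:
    """Return True if the receiver ends up with a signed announcement."""
    # An old publisher cannot sign, and a new publisher sends downgraded
    # non-signed announcements when talking to an old Introducer.
    reaches_introducer_signed = publisher_new and introducer_new
    # A new Introducer downgrades signed announcements for old receivers.
    return reaches_introducer_signed and receiver_new

def identifier_seen(publisher_new: bool, introducer_new: bool,
                    receiver_new: bool) -> str:
    """Return which kind of identifier the receiver sees for the sender."""
    signed = announcement_is_signed(publisher_new, introducer_new,
                                    receiver_new)
    return "Node ID" if signed else "TubID"
```

Only the combination in which all three participants are new yields a
signed announcement; every other combination falls back to a TubID.
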
Share Placement
---------------

Tahoe-LAFS uses a "permuted ring" algorithm to decide where to place
shares for any given file. For each potential server, it uses that
server's "permutation seed" to compute a pseudo-random but deterministic
location on a ring, then walks the ring in clockwise order, asking each
server in turn to hold a share until all are placed. When downloading a
file, the servers are accessed in the same order. This minimizes the
number of queries that must be done to download a file, and tolerates
"churn" (nodes being added and removed from the grid) fairly well.

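A minimal sketch of the permuted-ring idea follows. It makes several
assumptions that do not come from this document: the ring position is
taken to be SHA-256 over a per-file value concatenated with the seed,
``file_index`` is a hypothetical stand-in for whatever per-file value
the real code uses, and every server is assumed to accept the share it
is offered. The function names are illustrative only.

```python
import hashlib
from typing import Dict, List

def permuted_order(seeds: Dict[str, bytes], file_index: bytes) -> List[str]:
    """Return server names in the order they would be visited for one file."""
    def ring_position(name: str) -> bytes:
        # Assumption: hashing (file_index + seed) gives a pseudo-random
        # but deterministic position on the ring.
        return hashlib.sha256(file_index + seeds[name]).digest()
    # Sorting by position models a clockwise walk from position zero.
    return sorted(seeds, key=ring_position)

def place_shares(order: List[str], total_shares: int) -> Dict[int, str]:
    """Ask each server in turn to hold one share until all are placed."""
    if not order:
        raise ValueError("no servers available")
    # Assumption: every server accepts, and the walk wraps around when
    # there are fewer servers than shares.
    return {n: order[n % len(order)] for n in range(total_shares)}

seeds = {"alpha": b"seed-a", "bravo": b"seed-b", "charlie": b"seed-c"}
order = permuted_order(seeds, b"example-file-index")
# Removing a server leaves the relative order of the others unchanged,
# which is why this scheme tolerates churn fairly well.
remaining = {k: v for k, v in seeds.items() if k != order[0]}
assert permuted_order(remaining, b"example-file-index") == order[1:]
```

Because each server's position depends only on the file and on that
server's own seed, uploaders and downloaders that see the same seeds
compute the same order independently.
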
This property depends upon server nodes having a stable permutation
seed. If a server's permutation seed were to change, it would
effectively wind up at a randomly selected place on the permuted ring.
Downloads would still complete, but clients would spend more time asking
other servers before querying the correct one.

In the old 1.9 code, the permutation-seed was always equal to the TubID.
In 1.10, servers include their permutation-seed as part of their
announcement. To improve stability for existing grids, if an old server
(one with existing shares) is upgraded to run the 1.10 codebase, it will
use its old TubID as its permutation-seed. When a new empty server runs
the 1.10 code, it will use its Node ID instead. In both cases, once the
node has picked a permutation-seed, it will continue using that value
forever.

To be specific, when a node wakes up running the 1.10 code, it will look
for a recorded NODEDIR/permutation-seed file, and use its contents if
present. If that file does not exist, it creates it (with the TubID if
it has any shares, otherwise with the Node ID), and uses the contents as
the permutation-seed.

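The startup logic just described can be sketched as follows. The
function name, its parameters, and the on-disk format (a single line of
ASCII text) are assumptions made for illustration; only the file
location and the choice between TubID and Node ID come from the text
above.

```python
import tempfile
from pathlib import Path

def load_or_create_permutation_seed(nodedir: str, tub_id: str,
                                    node_id: str, has_shares: bool) -> str:
    """Return the node's permutation-seed, recording it on first use."""
    seed_file = Path(nodedir) / "permutation-seed"
    if seed_file.exists():
        # A recorded seed always wins, so the value never changes.
        return seed_file.read_text(encoding="ascii").strip()
    # First start under 1.10: a server that already holds shares keeps
    # its TubID, while a new empty server uses its Node ID.
    seed = tub_id if has_shares else node_id
    seed_file.write_text(seed + "\n", encoding="ascii")
    return seed

# Usage: the first call picks and records the seed, later calls reuse it
# even if the inputs differ.
with tempfile.TemporaryDirectory() as nodedir:
    first = load_or_create_permutation_seed(nodedir, "tubid", "v0-nodeid", True)
    again = load_or_create_permutation_seed(nodedir, "tubid", "v0-nodeid", False)
    assert first == again == "tubid"
```

Recording the choice in a file is what makes the seed stable: the
decision between TubID and Node ID is made once and then never
revisited.
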
There is one unfortunate consequence of this pattern. If a new 1.10
server is created in a grid that has an old client, or has a new client
but an old Introducer, then that client will see downgraded non-signed
announcements, and thus will first upload shares with the TubID-based
permutation-seed. Later, when the client and/or Introducer is upgraded,
the client will start seeing signed announcements with the NodeID-based
permutation-seed, and will then look for shares in the wrong place. This
will hurt performance in a large grid, but should not affect
reliability. This effect shouldn't even be noticeable in grids for which
the number of servers is close to the "N" shares.total number (e.g.
where num-servers < 3*N). And the as-yet-unimplemented "share
rebalancing" feature should repair the misplacement.

If you wish to avoid this effect, try to upgrade both Introducers and
clients at about the same time. (Upgrading servers does not matter: they
will continue to use the old permutation-seed.)