| 1 | The Debian OpenSSL bug that was announced last week has some effects on |
| 2 | Foolscap security, detailed by the Foolscap trac page: |
| 3 | |
| 4 | http://foolscap.lothar.com/trac/wiki/DebianOpenSslBug |
| 5 | |
| 6 | Now, what are the consequences for Tahoe? |
| 7 | |
| 8 | In summary: not very severe. Once you've upgraded to the fixed openssl |
| 9 | library, the lingering effects of weak keys are (starting with the most |
| 10 | severe): |
| 11 | |
| 12 | 1. a successful man-in-the-middle (MitM) attack could allow the attacker to |
| 13 | delete (or roll back) mutable file shares for which they do not have the |
| 14 | write-cap. |
| 15 | 2. clients who were not given the introducer.furl could use a MitM attack to |
| 16 | connect to the introducer anyway, and from there get access to storage |
| 17 | servers. |
| 18 | 3. clients who were not given a helper.furl could use a MitM attack to |
| 19 | connect to (and use) a helper process. |
| 20 | 4. clients who were not given a key-generator.furl could use a MitM attack |
| 21 | to connect to (and drain the keys out of) a key generator. This is a DoS |
| 22 | attack only. |
| 23 | 5. attackers could mount a MitM attack between a node and its log-gatherer, |
| 24 | allowing the attacker to view the node's logs (which contain no secrets, |
| 25 | but which would assist a traffic-analysis attack). |
| 26 | |
| 27 | The only vulnerable component of Tahoe is the Foolscap TubID. All other |
| 28 | random numbers are either generated by Crypto++ or by calling os.urandom() |
| 29 | (which uses the kernel's /dev/urandom RNG): this includes the AES and RSA |
| 30 | keys used for write-caps, and the unguessable swissnums used to grant access |
| 31 | to Referenceables. |
| 32 | |
| 33 | Tahoe benefits immensely from its conservative "trust nobody" design: none of |
| 34 | the important secrets leave the user's computer. We were somewhat lucky that |
| 35 | openssl was not used to generate any of these important secrets. The remaining |
| 36 | problems are described below. |
| 37 | |
| 38 | == Mutable File Share write-secrets == |
| 39 | |
| 40 | The authority to modify a mutable file is expressed in its "write-cap", which |
| 41 | includes enough information to obtain an RSA private (signing) key. Anyone |
| 42 | who can sign shares with the right key will be able to modify the file any |
| 43 | way they please. |
| 44 | |
| 45 | These shares are stored on untrusted servers, which could damage or delete them |
| 46 | (since there are extensive cryptographic hashes checked on each share, |
| 47 | culminating in the RSA signature, damaging a share is equivalent to deleting |
| 48 | it). The servers could also "roll back" the share to an earlier state. If |
| 49 | enough servers do this, a client could see the file revert to an earlier |
| 50 | version. Rollback is the one way in which the servers can exert a form of |
| 51 | "write authority" over a mutable file. Other parties are not supposed to have |
| 52 | any such power. |
| 53 | |
| 54 | To reduce storage server workload, and to reduce version dependencies, the |
| 55 | servers do not actually check this signature at upload/modify time (clients |
| 56 | who are downloading the mutable file are the only ones who check it). |
| 57 | Instead, when the mutable file's shares are created for the first time, the |
| 58 | original uploader creates a set of "write secrets", one for each server, |
| 59 | which are derived from the hash of the write-cap and the server's peerid. The |
| 60 | server will accept an update from anyone who can provide the same secret. |
| 61 | These secrets are different for each server, so serverA has no authority over |
| 62 | a different share of the same file on serverB. |
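
The per-server write-secret scheme can be sketched as follows. This is only an illustration: the function and class names, the plain SHA-256 hash, and the example byte strings are assumptions made for this sketch, not Tahoe's actual derivation or storage API.

```python
import hashlib
import hmac


def derive_write_secret(write_cap: bytes, peerid: bytes) -> bytes:
    """Derive a per-server secret from the write-cap and the server's peerid.

    Assumption: plain SHA-256 over length-prefixed inputs stands in for
    whatever hash construction Tahoe really uses.
    """
    h = hashlib.sha256()
    for part in (write_cap, peerid):
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.digest()


class ShareSlot:
    """Hypothetical server-side slot holding one mutable share."""

    def __init__(self, write_secret: bytes, data: bytes):
        self._write_secret = write_secret
        self.data = data

    def update(self, presented_secret: bytes, new_data: bytes) -> bool:
        # The server does not verify the RSA signature here; it only
        # compares the presented secret with the one stored at creation.
        if not hmac.compare_digest(self._write_secret, presented_secret):
            return False
        self.data = new_data
        return True


write_cap = b"example-write-cap"
secret_a = derive_write_secret(write_cap, b"peerid-of-serverA")
secret_b = derive_write_secret(write_cap, b"peerid-of-serverB")

slot_on_b = ShareSlot(secret_b, b"share v1")
accepted_with_b = slot_on_b.update(secret_b, b"share v2")
accepted_with_a = slot_on_b.update(secret_a, b"tampered")
```

In this sketch the secret held by serverA is rejected by the slot on serverB, which is the property that keeps one server from modifying shares held by another.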
| 63 | |
| 64 | Since these shared secrets are sent over the Foolscap connection with no |
| 65 | further encryption, a successful MitM attack (accomplished against a storage |
| 66 | server that uses a Tub certificate generated by the buggy version of OpenSSL) |
| 67 | could reveal these secrets to the attacker. This attacker would then get the |
| 68 | authority to make changes to those shares. They would be unable to forge |
| 69 | valid signatures, so they would be limited to the same deletion-or-rollback |
| 70 | attacks that the server could perform. They could only perform these attacks |
| 71 | on the servers that had weak Tub certificates. |
| 72 | |
| 73 | == Unauthorized Access To introducer/helper/key-generator == |
| 74 | |
| 75 | Several configuration controls use FURLs to provide/limit access to certain |
| 76 | grid services. The main one is the introducer.furl: clients use this to |
| 77 | contact the Introducer, from which they get access to all storage servers. In |
| 78 | the current release, access to the storage servers can be withheld by not |
| 79 | publishing the introducer.furl. (We plan to change this: once Accounting is |
| 80 | in place, the introducer will be more public, and access to storage servers |
| 81 | will be controlled by a signed and authorized private key.) |
| 82 | |
| 83 | If the Introducer was created with the buggy version of openssl, its TubID |
| 84 | will be guessable. This enables a man-in-the-middle attack between an |
| 85 | authorized client and the Introducer, from which the attacker can learn the |
| 86 | unguessable swissnum that protects access to the Introducer. A successful |
| 87 | attack would thus allow an unauthorized party to connect to the Introducer |
| 88 | and therefore use storage services. |
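
To make the roles of the TubID and the swissnum concrete, here is a small sketch that splits a FURL into its parts. It assumes the common pb://TubID@location-hints/swissnum layout; the parser, its names, and the example FURL are made up for illustration and are not part of Foolscap's API.

```python
from typing import List, NamedTuple


class Furl(NamedTuple):
    tub_id: str                 # identifies the Tub certificate (authenticates the server)
    location_hints: List[str]   # where to connect (host:port pairs)
    swissnum: str               # unguessable secret that authorizes access to the object


def parse_furl(furl: str) -> Furl:
    """Split a FURL into TubID, location hints, and swissnum (illustrative only)."""
    prefix = "pb://"
    if not furl.startswith(prefix):
        raise ValueError("not a pb:// FURL")
    rest = furl[len(prefix):]
    tub_id, sep, rest = rest.partition("@")
    if not sep or not tub_id:
        raise ValueError("missing TubID")
    hints, sep, swissnum = rest.partition("/")
    if not sep or not swissnum:
        raise ValueError("missing swissnum")
    return Furl(tub_id, hints.split(","), swissnum)


# Entirely fictional FURL, used only to exercise the parser.
example = parse_furl("pb://exampletubid@introducer.example.org:12345/exampleswissnum")
```

A client trusts whoever can prove ownership of the certificate behind `tub_id`, then sends `swissnum` over that connection. If the certificate's key is guessable, an impostor can pass the first check and read the swissnum.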
| 89 | |
| 90 | Similarly, access to the Helper and the Key-Generator is enabled/protected by |
| 91 | distributing FURLs, and when these FURLs use guessable Tub certificates, an |
| 92 | attacker will be able to perform a successful MitM attack against a user of |
| 93 | the service. From this, the attacker can learn the swissnum, and thus gain |
| 94 | access to the service. |
| 95 | |
| 96 | Unauthorized access to the Helper means the attacker gets to upload files and |
| 97 | consume the Helper's CPU time (which may have been intended to be reserved |
| 98 | for paying customers). |
| 99 | |
| 100 | The "key generator" is a small process that creates RSA keypairs, intended to |
| 101 | offload mutable file creation work from a webapi server. (The RSA key |
| 102 | generation process involves 0.5s to 3.0s of blocking CPU time, so the webapi |
| 103 | machine's responsiveness to other requests is improved by passing the work to |
| 104 | a separate process.) It pre-generates a small pool of keys to respond faster. |
| 105 | An attacker who uses a MitM attack to gain access to the key generator could |
| 106 | request a lot of keys, causing extra CPU load and draining this pool, which |
| 107 | would slow down legitimate requests. |
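
The pool-draining effect can be modelled with a toy key pool. This is a sketch under stated assumptions: the KeyPool class and its methods are hypothetical, and a counter-based stand-in replaces real RSA key generation so the example runs instantly.

```python
import itertools
from collections import deque
from typing import Callable, Deque, Tuple


class KeyPool:
    """Hypothetical pool of pre-generated keys."""

    def __init__(self, generate: Callable[[], str], pool_size: int):
        self._generate = generate
        self._pool: Deque[str] = deque(generate() for _ in range(pool_size))

    def get_key(self) -> Tuple[str, bool]:
        """Return (key, served_from_pool)."""
        if self._pool:
            return self._pool.popleft(), True
        # Pool is empty: the caller has to wait for a freshly generated key.
        return self._generate(), False


_counter = itertools.count()


def fake_generate() -> str:
    # Stand-in for slow RSA keypair generation.
    return f"key-{next(_counter)}"


pool = KeyPool(fake_generate, pool_size=3)
attacker_results = [pool.get_key() for _ in range(3)]   # attacker drains the pool
legit_key, legit_fast = pool.get_key()                  # legitimate request arrives next
```

Once the attacker has taken the pre-generated keys, the legitimate request falls through to the slow path, which is why this is a denial-of-service problem and not a confidentiality one.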
| 108 | |
| 109 | == Log Gatherer == |
| 110 | |
| 111 | Tahoe nodes can be configured with a log-gatherer.furl, which directs the |
| 112 | node to connect to the given gatherer and offer its "log port". The log port |
| 113 | can be used to retrieve stored log messages, and to subscribe to new ones. |
| 114 | Grid managers can use this to record verbose information about uploads and |
| 115 | downloads. |
| 116 | |
| 117 | If the log-gatherer is using a weak Tub certificate, an attacker could mount |
| 118 | a successful MitM attack between the node and the gatherer, revealing the |
| 119 | swissnum of the node's logport. This would allow the attacker to see the same |
| 120 | log messages that the gatherer sees. |
| 121 | |
| 122 | By design, Tahoe nodes do not log secrets. Instead, most upload/download |
| 123 | operations refer to the Storage Index of the file being processed, which is |
| 124 | public information (storage servers and several diagnostic web pages show the |
| 125 | SI values). However, the logs do contain file sizes, and the information |
| 126 | therein would be useful to an attacker interested in performing a |
| 127 | traffic-analysis attack: it could help them learn who is interested in the |
| 128 | same file, or who is downloading a file that someone else uploaded. So, while |
| 129 | exposure of the logs does not threaten data confidentiality or integrity, you |
| 130 | still wouldn't want to publish them to the world, which is why the |
| 131 | log-gatherer.furl is meant to control who receives them. |
| 132 | |
| 133 | == Fixing The Problems == |
| 134 | |
| 135 | To fix these problems, server operators need to regenerate any Tub |
| 136 | certificates that were created while the buggy version of openssl was |
| 137 | installed. However, there are several operational problems that may make this |
| 138 | more difficult than it sounds. |
| 139 | |
| 140 | * introducer.furl: All clients need to be updated with the new FURL, which may |
| 141 | require touching hundreds of client machines. Since the Introducer FURL is |
| 142 | the primary entry point, Tahoe does not have a mechanism to automatically |
| 143 | update it from some other server. |
| 144 | * helper.furl: same problem. Eventually, Helpers will be accessed through the |
| 145 | Introducer, but in the current release, the helper is configured by writing |
| 146 | to the helper.furl file, so it must be updated as well. |
| 147 | * storage servers: Storage Server FURLs are distributed through the |
| 148 | Introducer, so it would seem straightforward to delete the server's |
| 149 | "node.pem" file, restart it, and allow it to generate a new one: the server |
| 150 | would connect to the introducer and appear as a brand new server (that |
| 151 | happens to have the same shares as it did before). |
| 152 | * However, there are two problems that will result if this is done with the |
| 153 | current release. The most significant is that clients use shared secrets |
| 154 | derived partially from the storage server's TubID. The most important one |
| 155 | is the mutable-share write-secret, which allows clients to modify mutable |
| 156 | files (including modifying directories). If the storage server's TubID no |
| 157 | longer matches the secret that was stored in the share, then clients will |
| 158 | get errors when they attempt to modify those shares. In many cases, this |
| 159 | will prevent users from modifying their directories. |
| 160 | * There are plans to fix this: the error message includes the TubID that was |
| 161 | used to generate the secret, so the plan is to add a storage API that |
| 162 | allows the client to change the shared secret (by providing both the old |
| 163 | one and the new one). This will allow clients to tolerate shares being |
| 164 | moved from one server to another, which would be the effect of |
| 165 | regenerating the Tub certificate for those storage servers. |
| 166 | * The second problem is that the peer selection algorithm would now see |
| 167 | shares in non-optimal places. This would look a lot like large-scale |
| 168 | churn: shares being moved to random servers, not necessarily the same |
| 169 | servers that the node would expect to find them on. The peer selection |
| 170 | algorithm is designed to tolerate this, but the effect will be a |
| 171 | slowdown: nodes will be looking for their shares in the wrong place, so |
| 172 | they'll have to search further than usual, and this will take additional |
| 173 | round trips. So changing the servers' TubIDs will also affect client |
| 174 | download performance. To address this, a file-repair step that moves |
| 175 | shares to their ideal locations needs to be written. |
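
The planned secret-change call mentioned in the list above might look roughly like this. Everything here is hypothetical: the storage API does not exist yet, the method names are invented, and plain SHA-256 stands in for the real secret derivation.

```python
import hashlib
import hmac


def derive_write_secret(write_cap: bytes, tub_id: bytes) -> bytes:
    """Illustrative derivation; the real hash construction is not shown here."""
    return hashlib.sha256(write_cap + b"|" + tub_id).digest()


class ShareSlot:
    """Hypothetical server-side record of a mutable share's write-secret."""

    def __init__(self, write_secret: bytes):
        self._write_secret = write_secret

    def check(self, presented: bytes) -> bool:
        return hmac.compare_digest(self._write_secret, presented)

    def change_secret(self, old_secret: bytes, new_secret: bytes) -> bool:
        # The caller must prove knowledge of the old secret before replacing it.
        if not self.check(old_secret):
            return False
        self._write_secret = new_secret
        return True


write_cap = b"example-write-cap"
old_tub_id = b"weak-tubid"
new_tub_id = b"regenerated-tubid"

# The share was created while the server still had its old TubID.
slot = ShareSlot(derive_write_secret(write_cap, old_tub_id))

# After the certificate is regenerated, the freshly derived secret no longer matches.
new_secret = derive_write_secret(write_cap, new_tub_id)
rejected_before_change = not slot.check(new_secret)

# The error names the old TubID, so the client can recompute the old secret
# and present both to the (hypothetical) change call.
old_secret = derive_write_secret(write_cap, old_tub_id)
changed = slot.change_secret(old_secret, new_secret)
accepted_after_change = slot.check(new_secret)
```

Only a holder of the write-cap can compute both secrets, so in this sketch the change call does not widen who can modify the share.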