wiki:TahoeVsDebianBuggyOpenSsl

Version 1 (modified by warner, at 2008-05-22T21:12:51Z)

describe the effects of the debian SSL bug on Tahoe

The Debian OpenSSL bug that was announced last week has some effects on Foolscap security, which are detailed on the Foolscap trac page:

http://foolscap.lothar.com/trac/wiki/DebianOpenSslBug

Now, what are the consequences for Tahoe?

In summary: not very severe. Once you've upgraded to the fixed openssl library, the lingering effects of weak keys are (starting with the most severe):

  1. A successful man-in-the-middle (MitM) attack could allow the attacker to delete (or roll back) mutable file shares for which they do not have the write-cap.
  2. Clients who were not given the introducer.furl could use a MitM attack to connect to the introducer anyway, and from there get access to storage servers.
  3. Clients who were not given a helper.furl could use a MitM attack to connect to (and use) a helper process.
  4. Clients who were not given a key-generator.furl could use a MitM attack to connect to (and drain the keys out of) a key generator. This is a DoS attack only.
  5. Attackers could mount a MitM attack between a node and its log-gatherer, allowing the attacker to view the node's logs (which contain no secrets, but which would assist a traffic-analysis attack).

The only vulnerable component of Tahoe is the Foolscap TubID. All other random numbers are either generated by Crypto++ or by calling os.urandom() (which uses the kernel's /dev/urandom RNG): this includes the AES and RSA keys used for write-caps, and the unguessable swissnums used to grant access to Referenceables.
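As a rough illustration of why these values were unaffected, the sketch below draws an unguessable token straight from the kernel RNG via os.urandom(), without touching OpenSSL. The function name make_swissnum, the 16-byte length, and the base32 encoding are assumptions made for this example; they are not taken from the Tahoe or Foolscap source.

```python
import base64
import os


def make_swissnum(num_bytes: int = 16) -> str:
    """Return an unguessable token drawn from the kernel RNG.

    Hypothetical helper: the length and encoding are illustrative only.
    """
    raw = os.urandom(num_bytes)  # kernel RNG, independent of OpenSSL
    # Base32 without padding gives a lowercase, URL-friendly string.
    return base64.b32encode(raw).decode("ascii").rstrip("=").lower()


token_one = make_swissnum()
token_two = make_swissnum()
```

Because the randomness comes from the kernel, a token produced this way is not weakened by a broken userspace OpenSSL library.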

Tahoe benefits immensely from its conservative "trust nobody" design: none of the important secrets leave the user's computer. We were somewhat lucky that openssl was not used to generate any of these important secrets. The remaining problems are described below.

Mutable File Share write-secrets

The authority to modify a mutable file is expressed in its "write-cap", which includes enough information to obtain an RSA private (signing) key. Anyone who can sign shares with the right key will be able to modify the file any way they please.

These shares are stored on untrusted servers, who could damage or delete them (since there are extensive cryptographic hashes checked on each share, culminating in the RSA signature, damaging a share is equivalent to deleting it). The servers could also "roll back" the share to an earlier state. If enough servers do this, a client could see the file revert to an earlier version. Rollback is the one way in which the servers can exert a form of "write authority" over a mutable file. Other parties are not supposed to have any such power.

To reduce storage server workload, and to reduce version dependencies, the servers do not actually check this signature at upload/modify time (clients who are downloading the mutable file are the only ones who check it). Instead, when the mutable file's shares are created for the first time, the original uploader creates a set of "write secrets", one for each server, which are derived from the hash of the write-cap and the server's peerid. The server will accept an update from anyone who can provide the same secret. These secrets are different for each server, so serverA has no authority over a different share of the same file on serverB.
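The per-server secret scheme described above can be sketched as follows. This is a minimal illustration, not Tahoe's actual derivation: the use of SHA-256, the "write-secret:" tag, and the names derive_write_secret and ShareSlot are all assumptions made for the example.

```python
import hashlib
import hmac


def derive_write_secret(write_cap: bytes, peerid: bytes) -> bytes:
    """Derive a per-server secret from the write-cap hash and the peerid.

    Assumed construction; the real hash and tagging scheme may differ.
    """
    cap_hash = hashlib.sha256(write_cap).digest()
    return hashlib.sha256(b"write-secret:" + cap_hash + peerid).digest()


class ShareSlot:
    """Hypothetical server-side record for one mutable share."""

    def __init__(self, write_secret: bytes, data: bytes):
        self._write_secret = write_secret
        self.data = data

    def update(self, presented_secret: bytes, new_data: bytes) -> bool:
        # The server checks only the shared secret, not the RSA signature.
        if not hmac.compare_digest(self._write_secret, presented_secret):
            return False
        self.data = new_data
        return True


# Made-up values for illustration.
secret_a = derive_write_secret(b"example-write-cap", b"serverA")
secret_b = derive_write_secret(b"example-write-cap", b"serverB")
slot_on_a = ShareSlot(secret_a, b"version 1")

rejected = slot_on_a.update(secret_b, b"tampered")   # serverB's secret fails
accepted = slot_on_a.update(secret_a, b"version 2")  # the right secret works
```

The point of mixing in the peerid is visible here: the secret held by one server is useless against the share held by another.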

Since these shared secrets are sent over the Foolscap connection with no further encryption, a successful MitM attack (accomplished against a storage server that uses a Tub certificate generated by the buggy version of OpenSSL) could reveal these secrets to the attacker. This attacker would then get the authority to make changes to those shares. They would be unable to forge valid signatures, so they would be limited to the same deletion-or-rollback attacks that the server could perform. They could only perform these attacks on the servers that had weak Tub certificates.

Unauthorized Access To introducer/helper/key-generator

Several configuration controls use FURLs to provide/limit access to certain grid services. The main one is the introducer.furl: clients use this to contact the Introducer, from which they get access to all storage servers. In the current release, access to the storage servers can be withheld by not publishing the introducer.furl. (We plan to change this: once Accounting is in place, the introducer will be more public, and access to storage servers will be controlled by a signed and authorized private key.)

If the Introducer was created with the buggy version of openssl, its TubID will be guessable. This enables a man-in-the-middle attack between an authorized client and the Introducer, from which the attacker can learn the unguessable swissnum that protects access to the Introducer. A successful attack would thus allow an unauthorized party to connect to the Introducer and therefore use storage services.

Similarly, access to the Helper and the Key-Generator is enabled/protected by distributing FURLs, and when these FURLs use guessable Tub certificates, an attacker will be able to perform a successful MitM attack against a user of the service. From this, the attacker can learn the swissnum, and thus gain access to the service.
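To make the two parts of a FURL concrete, here is a small parsing sketch. It assumes the usual Foolscap layout of pb://TUBID@LOCATION-HINTS/SWISSNUM; the Furl type, the parse_furl function, and the example FURL are made up for illustration and are not part of Foolscap's API.

```python
from typing import List, NamedTuple


class Furl(NamedTuple):
    tub_id: str                # identifies the Tub certificate (the weak part)
    location_hints: List[str]  # host:port pairs to try
    swissnum: str              # unguessable name of the Referenceable


def parse_furl(furl: str) -> Furl:
    """Split a FURL into its parts, assuming the pb://id@hints/name layout."""
    prefix = "pb://"
    if not furl.startswith(prefix):
        raise ValueError("not a pb:// FURL")
    tub_id, at_sign, rest = furl[len(prefix):].partition("@")
    if not at_sign:
        raise ValueError("FURL has no TubID")
    hints, slash, swissnum = rest.partition("/")
    if not slash or not swissnum:
        raise ValueError("FURL has no swissnum")
    return Furl(tub_id, hints.split(","), swissnum)


# Entirely made-up example value.
parsed = parse_furl("pb://abc123@example.org:1234,10.0.0.1:1234/s3cr3t")
```

The swissnum itself is still unguessable; the exposure comes from the Tub certificate named by the TubID, since a weak certificate lets a MitM attacker observe the swissnum in transit.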

Unauthorized access to the Helper means the attacker gets to upload files and consume the Helper's CPU time (which may have been intended to be reserved for paying customers).

The "key generator" is a small process that creates RSA keypairs, intended to offload mutable file creation work from a webapi server. (The RSA key generation process involves 0.5s to 3.0s of blocking CPU time, so the webapi machine's responsiveness to other requests is improved by passing the work to a separate process.) It pre-generates a small pool of keys to respond faster. An attacker who uses a MitM attack to gain access to the key generator could request a lot of keys, causing extra CPU load and draining this pool, which would slow down legitimate requests.
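The pool-draining effect can be sketched with a toy model. Nothing here reflects the real key generator's code: KeyPool, its pool size of 3, and the fake_generate_keypair stand-in (which returns a label instead of doing real RSA work) are assumptions made for the example.

```python
import itertools
from collections import deque

_counter = itertools.count()


def fake_generate_keypair() -> str:
    """Stand-in for the slow (0.5s to 3.0s) RSA key generation step."""
    return "keypair-%d" % next(_counter)


class KeyPool:
    """Hypothetical pre-generated key pool."""

    def __init__(self, generate, pool_size: int = 3):
        self._generate = generate
        self._pool_size = pool_size
        self._pool = deque(generate() for _ in range(pool_size))
        self.slow_requests = 0  # requests that had to wait for generation

    def get_key(self) -> str:
        if self._pool:
            return self._pool.popleft()  # fast path: key already waiting
        self.slow_requests += 1          # slow path: generate on demand
        return self._generate()

    def refill(self) -> None:
        while len(self._pool) < self._pool_size:
            self._pool.append(self._generate())


pool = KeyPool(fake_generate_keypair, pool_size=3)
for _ in range(5):                 # attacker drains the pool and keeps asking
    pool.get_key()
slow_after_attack = pool.slow_requests
pool.get_key()                     # legitimate request now takes the slow path
slow_after_legit = pool.slow_requests
```

In this model the attacker gains nothing secret; the only cost is that legitimate callers fall onto the slow path until the pool is refilled.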

Log Gatherer

Tahoe nodes can be configured with a log-gatherer.furl, which directs the node to connect to the given gatherer and offer its "log port". The log port can be used to retrieve stored log messages, and to subscribe to new ones. Grid managers can use this to record verbose information about uploads and downloads.

If the log-gatherer is using a weak Tub certificate, an attacker could mount a successful MitM attack between the node and the gatherer, revealing the swissnum of the node's logport. This would allow the attacker to see the same log messages that the gatherer sees.

By design, Tahoe nodes do not log secrets. Instead, most upload/download operations refer to the Storage Index of the file being processed, which is public information (storage servers and several diagnostic web pages show the SI values). However, the logs do contain file sizes, and the information therein would be useful to an attacker interested in performing a traffic-analysis attack: it could help them learn who is interested in the same file, or who is downloading a file that someone else uploaded. So, while it does not threaten data confidentiality or integrity, you still wouldn't want to publish logs to the world, which is why the log-gatherer.furl is meant to control how this gets published.

Fixing The Problems

To fix these problems, server operators need to regenerate any Tub certificates that were created while the buggy version of openssl was installed. However, there are several operational problems that may make this more difficult than it sounds.

  • introducer.furl: All clients need to be updated with the new FURL, which may require touching hundreds of client machines. Since the Introducer FURL is the primary entry point, Tahoe does not have a mechanism to automatically update it from some other server.
  • helper.furl: Same problem. Eventually, Helpers will be accessed through the Introducer, but in the current release, the helper is configured by writing to the helper.furl file, so it must be updated on every client as well.
  • storage servers: Storage Server FURLs are distributed through the Introducer, so it would seem straightforward to delete the server's "node.pem" file, restart it, and allow it to generate a new one: the server would connect to the introducer and appear as a brand new server (that happens to have the same shares as it did before).
    • However, there are two problems that will result if this is done with the current release. The most significant is that clients use shared secrets derived partially from the storage server's TubID. The most important of these secrets is the mutable-share write-secret, which allows clients to modify mutable files (including modifying directories). If the storage server's TubID no longer matches the secret that was stored in the share, then clients will get errors when they attempt to modify those shares. In many cases, this will prevent users from modifying their directories.
    • There are plans to fix this: the error message includes the TubID that was used to generate the secret, so the plan is to add a storage API that allows the client to change the shared secret (by providing both the old one and the new one). This will allow clients to tolerate shares being moved from one server to another, which would be the effect of regenerating the Tub certificate for those storage servers.
    • The second problem is that the peer selection algorithm would now see shares in non-optimal places. This would look a lot like large-scale churn: shares being moved to random servers, not necessarily the same servers that the node would expect to find them on. The peer selection algorithm is designed to tolerate this, but the effect will be a slowdown: nodes will be looking for their shares in the wrong place, so they'll have to search further than usual, and this will take additional round trips. So changing the servers' TubIDs will also affect client download performance. To address this, a file-repair step that moves shares to their ideal locations needs to be written.
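The peer-selection slowdown can be illustrated with a simplified model in which each client orders the servers by a hash of the storage index and the server's ID, then queries them in that order. The hash construction, the function names, and the IDs below are assumptions for the sake of the sketch, not Tahoe's actual algorithm.

```python
import hashlib


def permuted_servers(storage_index: bytes, peerids):
    """Order servers by a hash of (storage index, peerid); assumed scheme."""
    return sorted(
        peerids,
        key=lambda pid: hashlib.sha256(storage_index + pid).digest(),
    )


def search_depth(storage_index: bytes, peerids, holder: bytes) -> int:
    """Number of servers queried, in permuted order, before reaching holder."""
    return permuted_servers(storage_index, peerids).index(holder) + 1


# Made-up grid of ten servers and one file.
storage_index = b"example-storage-index"
servers = [b"tubid-%d" % i for i in range(10)]

# The share was originally placed on the first server in the permuted order.
original_holder = permuted_servers(storage_index, servers)[0]
depth_before = search_depth(storage_index, servers, original_holder)

# That server regenerates its certificate and reappears under a new ID,
# still holding the same share.
new_id = b"tubid-regenerated"
servers_after = [new_id if s == original_holder else s for s in servers]
depth_after = search_depth(storage_index, servers_after, new_id)
```

Before the change the client finds the share on its first query. Afterwards the new ID hashes to an unrelated position, so depth_after can be anywhere from 1 to the number of servers, which is the extra searching described above.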