Changes between Initial Version and Version 1 of TahoeVsDebianBuggyOpenSsl


Timestamp:
2008-05-22T21:12:51Z (16 years ago)
Author:
warner
Comment:

describe the effects of the debian SSL bug on Tahoe

  • TahoeVsDebianBuggyOpenSsl

     1The Debian OpenSSL bug that was announced last week has some effects on
     2Foolscap security, detailed by the Foolscap trac page:
     3
     4 http://foolscap.lothar.com/trac/wiki/DebianOpenSslBug
     5
     6Now, what are the consequences for Tahoe?
     7
     8In summary: not very severe. Once you've upgraded to the fixed openssl
     9library, the lingering effects of weak keys are (starting with the most
     10severe):
     11
     12 1. a successful man-in-the-middle (MitM) attack could allow the attacker to delete
     13    (or roll back) mutable file shares for which they do not have the
     14    write-cap.
     15 2. clients who were not given the introducer.furl could use a MitM attack to
     16    connect to the introducer anyway, and from there get access to storage
     17    servers.
     18 3. clients who were not given a helper.furl could use a MitM attack to
     19    connect to (and use) a helper process.
     20 4. clients who were not given a key-generator.furl could use a MitM attack
     21    to connect to (and drain the keys out of) a key generator. This is a DoS
     22    attack only.
     23 5. attackers could mount a MitM attack between a node and its log-gatherer,
     24    allowing the attacker to view the node's logs (which contain no secrets,
     25    but which would assist a traffic-analysis attack)
     26
     27The only vulnerable component of Tahoe is the Foolscap TubID. All other
     28random numbers are either generated by Crypto++ or by calling os.urandom()
     29(which uses the kernel's /dev/urandom RNG): this includes the AES and RSA
     30keys used for write-caps, and the unguessable swissnums used to grant access
     31to Referenceables.
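
A minimal sketch of how an unguessable token can be drawn from the kernel RNG via os.urandom(). The function name `make_swissnum`, the 20-byte length, and the base32 encoding are assumptions made for illustration; this is not Foolscap's actual swissnum code.

```python
import os
from base64 import b32encode


def make_swissnum(num_bytes: int = 20) -> str:
    """Return a random token built from kernel-provided entropy.

    Hypothetical helper: the length and encoding are illustrative only.
    """
    raw = os.urandom(num_bytes)  # reads from the kernel's RNG
    # Lower-case base32 without padding keeps the token URL-friendly.
    return b32encode(raw).decode("ascii").lower().rstrip("=")


swissnum = make_swissnum()
```

Because the bytes come from the kernel rather than from OpenSSL, a token made this way is not affected by the Debian bug.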
     32
     33Tahoe benefits immensely from its conservative "trust nobody" design: none of
     34the important secrets leave the user's computer. We were somewhat lucky that
     35openssl was not used to generate any of these important secrets. The remaining
     36problems are described below.
     37
     38== Mutable File Share write-secrets ==
     39
     40The authority to modify a mutable file is expressed in its "write-cap", which
     41includes enough information to obtain an RSA private (signing) key. Anyone
     42who can sign shares with the right key will be able to modify the file any
     43way they please.
     44
     45These shares are stored on untrusted servers, who could damage or delete them
     46(since there are extensive cryptographic hashes checked on each share,
     47culminating in the RSA signature, damaging a share is equivalent to deleting
     48it). The servers could also "roll back" the share to an earlier state. If
     49enough servers do this, a client could see the file revert to an earlier
     50version. Rollback is the one way in which the servers can exert a form of
     51"write authority" over a mutable file. Other parties are not supposed to have
     52any such power.
     53
     54To reduce storage server workload, and to reduce version dependencies, the
     55servers do not actually check this signature at upload/modify time (clients
     56who are downloading the mutable file are the only ones who check it).
     57Instead, when the mutable file's shares are created for the first time, the
     58original uploader creates a set of "write secrets", one for each server,
     59which are derived from the hash of the write-cap and the server's peerid. The
     60server will accept an update from anyone who can provide the same secret.
     61These secrets are different for each server, so serverA has no authority over
     62a different share of the same file on serverB.
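
The per-server write secrets described above can be sketched as follows. This is an illustration under stated assumptions: the tag string, the use of SHA-256, and the names `derive_write_secret` and `ShareSlot` are hypothetical and do not reproduce Tahoe's real derivation or storage API.

```python
import hashlib
import hmac


def derive_write_secret(write_cap: bytes, peerid: bytes) -> bytes:
    """Derive a per-server secret from the write-cap hash and the peerid.

    Hypothetical construction: the tag and hash function are assumptions.
    """
    h = hashlib.sha256()
    h.update(b"example-write-secret-v1:")         # illustrative tag
    h.update(hashlib.sha256(write_cap).digest())  # hash of the write-cap
    h.update(peerid)                              # ties the secret to one server
    return h.digest()


class ShareSlot:
    """Toy server-side slot that checks the shared secret, not the signature."""

    def __init__(self, write_secret: bytes, data: bytes):
        self._write_secret = write_secret
        self.data = data

    def update(self, presented_secret: bytes, new_data: bytes) -> bool:
        # Accept the update from anyone who presents the stored secret.
        if not hmac.compare_digest(self._write_secret, presented_secret):
            return False
        self.data = new_data
        return True


write_cap = b"example write-cap"
secret_a = derive_write_secret(write_cap, b"serverA")
secret_b = derive_write_secret(write_cap, b"serverB")

slot_a = ShareSlot(secret_a, b"version 1")
accepted = slot_a.update(secret_a, b"version 2")  # correct secret for serverA
rejected = slot_a.update(secret_b, b"version 3")  # serverB's secret is useless here
```

Since the slot checks only the secret, anyone who learns `secret_a` can overwrite the share on serverA, but gains nothing on serverB.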
     63
     64Since these shared secrets are sent over the Foolscap connection with no
     65further encryption, a successful MitM attack (accomplished against a storage
     66server that uses a Tub certificate generated by the buggy version of OpenSSL)
     67could reveal these secrets to the attacker. This attacker would then get the
     68authority to make changes to those shares. They would be unable to forge
     69valid signatures, so they would be limited to the same deletion-or-rollback
     70attacks that the server could perform. They could only perform these attacks
     71on the servers that had weak Tub certificates.
     72
     73== Unauthorized Access To introducer/helper/key-generator ==
     74
     75Several configuration controls use FURLs to provide/limit access to certain
     76grid services. The main one is the introducer.furl : clients use this to
     77contact the Introducer, from which they get access to all storage servers. In
     78the current release, access to the storage servers can be withheld by not
     79publishing the introducer.furl. (We plan to change this: once Accounting is
     80in place, the introducer will be more public, and access to storage servers
     81will be controlled by a signed and authorized private key.)
     82
     83If the Introducer was created with the buggy version of openssl, its TubID
     84will be guessable. This enables a man-in-the-middle attack between an
     85authorized client and the Introducer, from which the attacker can learn the
     86unguessable swissnum that protects access to the Introducer. A successful
     87attack would thus allow an unauthorized party to connect to the Introducer
     88and therefore use storage services.
     89
     90Similarly, access to the Helper and the Key-Generator is enabled/protected by
     91distributing FURLs, and when these FURLs use guessable Tub certificates, an
     92attacker will be able to perform a successful MitM attack against a user of
     93the service. From this, the attacker can learn the swissnum, and thus gain
     94access to the service.
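
To make the role of the two FURL components concrete, here is a small sketch that splits a FURL of the form pb://TUBID@HINTS/SWISSNUM into its parts. The parser `parse_furl` and the example FURL are hypothetical; real code would rely on Foolscap's own handling rather than this helper.

```python
from typing import List, NamedTuple


class Furl(NamedTuple):
    tubid: str                 # names the Tub (and so its certificate)
    location_hints: List[str]  # host:port pairs to try
    swissnum: str              # the secret that grants access to the object


def parse_furl(furl: str) -> Furl:
    """Split a pb:// FURL into TubID, location hints, and swissnum."""
    prefix = "pb://"
    if not furl.startswith(prefix):
        raise ValueError("not a pb:// FURL")
    tubid, at, rest = furl[len(prefix):].partition("@")
    hints, slash, swissnum = rest.partition("/")
    if not (at and slash and tubid and swissnum):
        raise ValueError("malformed FURL")
    return Furl(tubid, hints.split(","), swissnum)


# Made-up FURL, for illustration only.
example = "pb://exampletubid@host.example.com:12345,127.0.0.1:12345/exampleswissnum"
parsed = parse_furl(example)
```

The swissnum is sent to whichever party the client believes it authenticated, so an attacker able to impersonate the Tub sees it in full.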
     95
     96Unauthorized access to the Helper means the attacker gets to upload files and
     97consume the Helper's CPU time (which may have been intended to be reserved
     98for paying customers).
     99
     100The "key generator" is a small process that creates RSA keypairs, intended to
     101offload mutable file creation work from a webapi server. (The RSA key
     102generation process involves 0.5s to 3.0s of blocking CPU time, so the webapi
     103machine's responsiveness to other requests is improved by passing the work to
     104a separate process.) It pre-generates a small pool of keys to respond faster.
     105An attacker who uses a MitM attack to gain access to the key generator could
     106request a lot of keys, causing extra CPU load and draining this pool, which
     107would slow down legitimate requests.
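
The pool-draining effect can be sketched with a toy model. The class `KeyPool` is hypothetical, and `os.urandom` stands in for the expensive RSA keypair generation so that the example runs instantly; it is not the key generator's actual code.

```python
import os
from collections import deque


class KeyPool:
    """Toy pre-generated key pool; the 'keys' here are just random bytes."""

    def __init__(self, target_size: int = 3):
        self.target_size = target_size
        self._pool = deque()
        self.slow_requests = 0  # requests that had to wait for generation
        self.refill()

    def _generate(self) -> bytes:
        # Stand-in for RSA keypair generation (0.5s to 3.0s in the real service).
        return os.urandom(32)

    def refill(self) -> None:
        while len(self._pool) < self.target_size:
            self._pool.append(self._generate())

    def get_key(self) -> bytes:
        if self._pool:
            return self._pool.popleft()  # fast path: key was ready
        self.slow_requests += 1          # slow path: generate on demand
        return self._generate()


pool = KeyPool(target_size=3)
for _ in range(5):  # more requests than the pool holds
    pool.get_key()
```

Once the pool is empty, every further request takes the slow path until `refill()` runs, which is the slowdown a legitimate client would observe.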
     108
     109== Log Gatherer ==
     110
     111Tahoe nodes can be configured with a log-gatherer.furl, which directs the
     112node to connect to the given gatherer and offer its "log port". The log port
     113can be used to retrieve stored log messages, and to subscribe to new ones.
     114Grid managers can use this to record verbose information about uploads and
     115downloads.
     116
     117If the log-gatherer is using a weak Tub certificate, an attacker could mount
     118a successful MitM attack between the node and the gatherer, revealing the
     119swissnum of the node's logport. This would allow the attacker to see the same
     120log messages that the gatherer sees.
     121
     122By design, Tahoe nodes do not log secrets. Instead, most upload/download
     123operations refer to the Storage Index of the file being processed, which is
     124public information (storage servers and several diagnostic web pages show the
     125SI values). However, the logs do contain file sizes, and the information
     126therein would be useful to an attacker interested in performing a
     127traffic-analysis attack: it could help them learn who is interested in the
     128same file, or who is downloading a file that someone else uploaded. So, while
     129it does not threaten data confidentiality or integrity, you still wouldn't
     130want to publish logs to the world, which is why the log-gatherer.furl is
     131meant to control how this gets published.
     132
     133== Fixing The Problems ==
     134
     135To fix these problems, server operators need to regenerate any Tub
     136certificates that were created while the buggy version of openssl was
     137installed. However, there are several operational problems that may make this
     138more difficult than it sounds.
     139
     140 * introducer.furl: All clients need to be updated with the new FURL, which may
     141   require touching hundreds of client machines. Since the Introducer FURL is
     142   the primary entry point, Tahoe does not have a mechanism to automatically
     143   update it from some other server.
     144 * helper.furl: same problem. Eventually, Helpers will be accessed through the
     145   Introducer, but in the current release, the helper is configured by writing
     146   to the helper.furl file, so it must be updated as well.
     147 * storage servers: Storage Server FURLs are distributed through the
     148   Introducer, so it would seem straightforward to delete the server's
     149   "node.pem" file, restart it, and allow it to generate a new one: the server
     150   would connect to the introducer and appear as a brand new server (that
     151   happens to have the same shares as it did before).
     152  * However, there are two problems that will result if this is done with the
     153    current release. The most significant is that clients use shared secrets
     154    derived partially from the storage server's TubID. The most important one
     155    is the mutable-share write-secret, which allows clients to modify mutable
     156    files (including modifying directories). If the storage server's TubID no
     157    longer matches the secret that was stored in the share, then clients will
     158    get errors when they attempt to modify those shares. In many cases, this
     159    will prevent users from modifying their directories.
     160  * There are plans to fix this: the error message includes the TubID that was
     161    used to generate the secret, so the plan is to add a storage API that
     162    allows the client to change the shared secret (by providing both the old
     163    one and the new one). This will allow clients to tolerate shares being
     164    moved from one server to another, which would be the effect of
     165    regenerating the Tub certificate for those storage servers.
     166  * The second problem is that the peer selection algorithm would now see
     167    shares in non-optimal places. This would look a lot like large-scale
     168    churn: shares being moved to random servers, not necessarily the same
     169    servers that the node would expect to find them on. The peer selection
     170    algorithm is designed to tolerate this, but the effect will be a
     171    slowdown: nodes will be looking for their shares in the wrong place, so
     172    they'll have to search further than usual, and this will take additional
     173    round trips. So changing the servers' TubIDs will also affect client
     174    download performance. To address this, a file-repair step that moves
     175    shares to their ideal locations needs to be written.