#887 closed defect (duplicate)

twisted.web logs the uri on some exceptional conditions, leading to a privacy leak in logfiles

Reported by: zooko
Owned by: nobody
Priority: major
Milestone: undecided
Component: code-frontend-web
Version: 1.4.1
Keywords: confidentiality privacy logging
Cc:
Launchpad Bug:

Description

We have a policy of not logging filenames or caps into our logging system. This is very useful, because users who want to report a problem can then send us their log files, or let us connect a foolscap log watcher tool to their running Tahoe-LAFS node, without exposing their filenames or capabilities to us Tahoe-LAFS developers. However, I just noticed that twisted.web logs the URI in some error cases, which means the twistd.log file can contain these privacy-sensitive strings. I noticed because I was looking at a twistd.log file and it said:

2009-12-17 07:59:14.525Z [HTTPChannel,162,207.7.153.173] Unhandled Error
        Traceback (most recent call last):
        Failure: exceptions.RuntimeError: Producer was not unregistered for /uri/URI:CHK:dskdfkdsfdsf:skjhfsdfhdafkjhdskfjhskjdfhskjfhdksjhfkshf:3:10:6069379?save=true&filename=02.%E5%B7%AE%E4%B8%8D%E5%A4%9A%E5%85%88%E7%94%9F.mp3

(Actually I censored the cap itself when posting this ticket.)

Here is the twisted.web line that logs the uri:

http://twistedmatrix.com/trac/browser/trunk/twisted/web/http.py?rev=27335#L591

The error that is triggering this log message is #685 (RuntimeError: Producer was not unregistered), although there may well be other exceptional conditions that we might sometimes hit that could stimulate twisted to log the URI.

We have hitherto been treating the twistd.log file as a log file, potentially a source of useful diagnostic information, and inviting users to send theirs to us if they have problems. I guess in the short term we should stop doing that, although that could make it impossible to diagnose some things. In the long term we should systematically fix privacy and confidentiality leaks like this. (Also we should get rid of the twistd.log file entirely and make all logging go through the foolscap system. That is probably orthogonal to this ticket though.)

This was with the following versions of software:

         Nevow: 0.9.26
       Twisted: 2.5.0
      argparse: 0.8.0
      foolscap: 0.4.2
      platform: Linux-Ubuntu_8.04-i686-32bit
     pyOpenSSL: 0.6
    pycryptopp: 0.5.16-r669
        python: 2.5.2
        pyutil: 1.3.20
    setuptools: 0.6c8
    simplejson: 1.7.3
  tahoe-server: 1.4.1
       twisted: 2.5.0
     z-base-32: 1.0.1
          zfec: 1.4.0-4
zope.interface: 3.3.1

Change History (3)

comment:1 Changed at 2010-01-10T08:23:27Z by warner

One idea: we could have our web Request handler erase request.uri, or censor it. If this happens after .uri has been parsed into path components and query arguments, then I don't think any control flow will be affected, but all later log messages should emit the censored string instead of the original.

This would probably go into allmydata.webish.MyRequest.requestReceived, right after the last usage of self.uri.
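A rough sketch of the censoring idea above, using only the standard library. The names `censor_uri`, `CENSORED`, and `FakeRequest` are hypothetical and exist only for illustration; this is not the actual allmydata.webish.MyRequest code, and a real Twisted request object has more state than this stand-in. It only shows the ordering: parse the URI first, then overwrite the attribute that later log messages would read.

```python
from urllib.parse import parse_qs

# Placeholder written in place of privacy-sensitive text (assumed name).
CENSORED = "[CENSORED]"


def censor_uri(uri):
    """Keep only the top-level path segment; hide the cap, any child
    path components, and the whole query string."""
    path, sep, _query = uri.partition("?")
    segments = path.split("/")
    # For "/uri/URI:CHK:..." segments is ["", "uri", "URI:CHK:..."].
    kept = segments[:2]
    if len(segments) > 2:
        kept.append(CENSORED)
    censored = "/".join(kept)
    if sep:
        # Query values can hold filenames, so drop the query entirely.
        censored += "?" + CENSORED
    return censored


class FakeRequest:
    """Hypothetical stand-in for the web request object."""

    def __init__(self, uri):
        self.uri = uri
        self.path = None
        self.args = None

    def request_received(self):
        # Parse everything we need out of the URI first...
        path, _sep, query = self.uri.partition("?")
        self.path = path
        self.args = parse_qs(query)
        # ...then, after the last use of self.uri, replace it so that
        # any later error message prints the censored form.
        self.uri = censor_uri(self.uri)


if __name__ == "__main__":
    req = FakeRequest("/uri/URI:CHK:aaaa:bbbb:3:10:6069379?save=true&filename=a.mp3")
    req.request_received()
    print(req.uri)
```

The parsed `path` and `args` still hold the original values, so request handling is unaffected; only the string that error paths would log is replaced. Whether every Twisted code path that logs the URI reads this attribute after this point is an assumption that would need checking.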

comment:2 Changed at 2010-02-01T19:59:44Z by davidsarah

  • Component changed from unknown to code-frontend-web

comment:3 Changed at 2013-01-14T09:05:31Z by zooko

  • Resolution set to duplicate
  • Status changed from new to closed

Duplicate of #685.
