Opened at 2010-01-09T04:41:24Z
Closed at 2013-01-14T09:05:31Z
#887 closed defect (duplicate)
twisted.web logs the uri on some exceptional conditions, leading to a privacy leak in logfiles
Reported by: | zooko | Owned by: | nobody |
---|---|---|---|
Priority: | major | Milestone: | undecided |
Component: | code-frontend-web | Version: | 1.4.1 |
Keywords: | confidentiality privacy logging | Cc: | |
Launchpad Bug: |
Description
We have a policy of not logging filenames or caps in our logging system. This is very useful, because users who want to report a problem can then send us their log files, or let us connect a foolscap log watcher tool to their running Tahoe-LAFS node, without exposing their filenames or capabilities to us Tahoe-LAFS developers. However, I just noticed that twisted.web logs the URI in some error cases, which means the twistd.log file can contain these privacy-sensitive strings. I noticed because I was looking at a twistd.log file and it said:
2009-12-17 07:59:14.525Z [HTTPChannel,162,207.7.153.173] Unhandled Error
    Traceback (most recent call last):
    Failure: exceptions.RuntimeError: Producer was not unregistered for /uri/URI:CHK:dskdfkdsfdsf:skjhfsdfhdafkjhdskfjhskjdfhskjfhdksjhfkshf:3:10:6069379?save=true&filename=02.%E5%B7%AE%E4%B8%8D%E5%A4%9A%E5%85%88%E7%94%9F.mp3
(Actually I censored the cap itself when posting this ticket.)
Here is the twisted.web line that logs the uri:
http://twistedmatrix.com/trac/browser/trunk/twisted/web/http.py?rev=27335#L591
The error that triggers this log message is #685 (RuntimeError: Producer was not unregistered), although there may well be other exceptional conditions that we sometimes hit that could cause Twisted to log the URI.
We have hitherto been treating the twistd.log file as a log file, potentially a source of useful diagnostic information, and inviting users to send theirs to us if they have problems. I guess in the short term we should stop doing that, although that could make it impossible to diagnose some things. In the long term we should systematically fix privacy and confidentiality leaks like this. (Also we should get rid of the twistd.log file entirely and make all logging go through the foolscap system. That is probably orthogonal to this ticket though.)
This was with the following versions of software:
Nevow: 0.9.26, Twisted: 2.5.0, argparse: 0.8.0, foolscap: 0.4.2, platform: Linux-Ubuntu_8.04-i686-32bit, pyOpenSSL: 0.6, pycryptopp: 0.5.16-r669, python: 2.5.2, pyutil: 1.3.20, setuptools: 0.6c8, simplejson: 1.7.3, tahoe-server: 1.4.1, twisted: 2.5.0, z-base-32: 1.0.1, zfec: 1.4.0-4, zope.interface: 3.3.1
Change History (3)
comment:1 Changed at 2010-01-10T08:23:27Z by warner
one idea: we could have our web Request handler erase request.uri, or censor it. If this happens after .uri has been parsed into components and query strings, then I don't think any control flow will be affected, but all log messages should emit the censored string instead of the original.
This would probably go into allmydata.webish.MyRequest.requestReceived, right after the last usage of self.uri.
comment:2 Changed at 2010-02-01T19:59:44Z by davidsarah
- Component changed from unknown to code-frontend-web
comment:3 Changed at 2013-01-14T09:05:31Z by zooko
- Resolution set to duplicate
- Status changed from new to closed
duplicate of #685
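A rough sketch of the idea raised in this ticket, censoring request.uri once it has been parsed, might look like the following. It is only an illustration under stated assumptions: `censor_uri`, `FakeRequest`, and the `[CENSORED]` placeholder are hypothetical names invented here, `FakeRequest` is a stand-in rather than the real twisted.web request class, and only the `/uri/` path prefix seen in the log excerpt is handled.

```python
# Hypothetical sketch; none of these names come from Tahoe-LAFS or Twisted.

def censor_uri(uri):
    """Mask the cap-bearing path and the whole query string of a request URI."""
    path, sep, _query = uri.partition("?")
    # Assumption: caps appear directly after the "/uri/" prefix, as in the
    # log excerpt in this ticket. Other paths are left unchanged.
    if path.startswith("/uri/"):
        path = "/uri/[CENSORED]"
    # The query string can carry filename=..., so mask all of it.
    if sep:
        return path + "?[CENSORED]"
    return path


class FakeRequest(object):
    """Stand-in for a web request object; only the .uri handling matters."""

    def __init__(self, uri):
        self.uri = uri
        self.path = None
        self.query = None

    def requestReceived(self):
        # 1. Parse self.uri into components while it is still intact, so
        #    routing and argument handling see the real values.
        self.path, _, self.query = self.uri.partition("?")
        # 2. After the last real use of self.uri, overwrite it so that any
        #    later log message that interpolates self.uri is censored.
        self.uri = censor_uri(self.uri)


req = FakeRequest("/uri/URI:CHK:aaaa:bbbb:3:10:6069379?save=true&filename=x.mp3")
req.requestReceived()
print(req.uri)
```

Masking the entire query string is deliberately conservative, since parameters such as filename= are exactly the kind of data the logging policy is meant to keep out of twistd.log. Whether overwriting the attribute is safe in the real handler depends on nothing reading self.uri afterwards, which would need to be checked against the actual code.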