Opened at 2008-12-23T14:36:50Z
Last modified at 2014-08-27T04:44:46Z
#562 new defect
add a "censor" command to filter out sensitive information from log files — at Version 18
Reported by: | zooko | Owned by: | somebody |
---|---|---|---|
Priority: | major | Milestone: | eventually |
Component: | code | Version: | 1.2.0 |
Keywords: | privacy logging confidentiality | Cc: | |
Launchpad Bug: |
Description (last modified by zooko)
Change History (20)
comment:1 Changed at 2009-11-01T02:04:45Z by davidsarah
comment:2 Changed at 2009-12-20T23:41:01Z by davidsarah
- Keywords privacy added
If you like this bug, you might also like #860.
comment:3 Changed at 2010-02-01T19:51:37Z by davidsarah
- Keywords logging added
- Milestone changed from undecided to 1.7.0
comment:4 Changed at 2010-02-21T20:29:58Z by kevan
- Owner changed from somebody to kevan
comment:5 Changed at 2010-02-23T01:25:02Z by kevan
First, note that the log file that inspired this ticket is here: pipermail/tahoe-dev/attachments/20081222/20cc919e/attachment-0001.html
The tahoe-lafs code itself, unless I'm missing something, doesn't ever print the introducer_furl to a log. I notice that there's one exception in there with a censored furl; perhaps that's an artifact from how things were then, or something that foolscap is doing? I'll look into that more thoroughly later.
I do notice that the storage server furls are also censored in the motivating log file. I don't mind having them there in my log files, and, as Zooko points out in that thread, censoring too much makes the log files less useful. Maybe this can be a configuration switch -- if paranoid logging is turned on, then IP addresses, storage server furls, storage indices/verify caps are censored somehow, and if not they aren't.
comment:6 Changed at 2010-02-23T03:43:22Z by kevan
...alternatively, maybe we could add a tool to censor logs after they've been created.
For example, you can do
flogtool filter --after=5 logs/from-2010-02-21-124158--to-present.flog filtered.flog
to post-process logs that way. So maybe you could, if you wanted a censored log snippet to post to tahoe-dev or on the Trac, do something like
flogtool censor logs/from-2010-02-21-124158--to-present.flog censored.log
and have flogtool (or whatever) obfuscate the SIs, furls, and so on. Of course, it's probably much harder to do it that way.
Censorship in a running node is relatively easy, as you can easily determine what is what as it is being logged, and censor accordingly. Censorship after the fact is much harder, because you need to be able to reliably determine whether a certain string is a furl, a storage index, an IP address, something else that should be censored, or nothing at all. It seems to be closer to what I as a user would want, though; if I want to have a useful, low-effort log to attach to a bug report, I shouldn't have to run my node such that it never produces logs with information that might help me later, nor should I have to stop, reconfigure, and restart my node, then hope that the problem reappears.
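To illustrate why after-the-fact censorship is fragile, here is a minimal sketch of a pattern-based censor. The regular expressions and the `censor_text` name are assumptions made for illustration, not anything in flogtool or tahoe-lafs. It assumes furls look roughly like `pb://tubid@hints/name`, and it has no reliable way to recognise a storage index at all.

```python
import re

# Hypothetical patterns: they guess at what a furl and an IPv4 address look like.
# A bare storage index is just a base32 string, so it is not matched here.
FURL_RE = re.compile(r"pb://[a-z2-7]+@[^\s/]+/[A-Za-z0-9_\-+/=]+")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def censor_text(text):
    """Replace anything that looks like a furl or an IPv4 address."""
    # Furls go first so that connection hints inside them are removed as a unit.
    text = FURL_RE.sub("<CENSORED-FURL>", text)
    text = IPV4_RE.sub("<CENSORED-IP>", text)
    return text

if __name__ == "__main__":
    print(censor_text("connectTCP to ('127.0.0.1', 55368)"))
```

Any four dot-separated numbers get treated as an IP address, and anything sensitive that does not match a pattern slips through, which is the reliability problem described above.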
comment:7 Changed at 2010-04-05T12:23:40Z by francois
Kevan,
I like your idea of creating a new 'flogtool censor' command.
What about tagging potentially sensitive information at logging time? For example, let's modify this type of log line
connectTCP to ('127.0.0.1', 55368)
into
connectTCP to ('<IP>127.0.0.1</IP>', 55368)
It will then be pretty easy to filter out IP addresses, furls, storage indexes, and so on.
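A rough sketch of what filtering such tagged lines could look like. The `<IP>` tag comes from the example above; the `FURL` and `SI` tag names and both function names are assumptions for illustration only.

```python
import re

# Matches <IP>...</IP>, <FURL>...</FURL> or <SI>...</SI>; the backreference \1
# requires the closing tag to have the same name as the opening tag.
TAG_RE = re.compile(r"<(IP|FURL|SI)>(.*?)</\1>")

def censor_tagged(line):
    """Replace the contents of every tagged span, keeping the tag itself."""
    return TAG_RE.sub(lambda m: "<%s>censored</%s>" % (m.group(1), m.group(1)), line)

def strip_tags(line):
    """Drop the tags but keep the contents, for local uncensored viewing."""
    return TAG_RE.sub(lambda m: m.group(2), line)

if __name__ == "__main__":
    line = "connectTCP to ('<IP>127.0.0.1</IP>', 55368)"
    print(censor_tagged(line))
    print(strip_tags(line))
```

Because the tags are added while logging, the filter never has to guess what a string is; it only needs to find the markers.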
comment:8 Changed at 2010-04-13T22:53:46Z by kevan
- Status changed from new to assigned
That would solve the problem.
I haven't had much time to play with the censorer lately, but it's more or less functional now, with that idea. I'm hoping I can have some patches and tests for people to play with by the end of this weekend.
comment:9 Changed at 2010-05-01T23:48:24Z by kevan
A correct solution to this will probably need to be implemented in foolscap, since it turns out that a lot of the compromising log entries come from there.
David-Sarah suggested that foolscap could offer callers of its logging system a way to mark certain log messages (or certain parts of certain log messages) as sensitive, so flogtool censor or whatever would know to censor them. For example,
from foolscap.logging import log
[...]
log.msg("some stuff" + log.sensitive("sensitive information"))
You'd basically need to do the following to solve this ticket, if you wanted to do it as above:
- Decide how to represent sensitive information in foolscap logs, and implement the sensitive function.
- Implement flogtool censor.
- Go through and audit logging code in foolscap and tahoe-lafs so that it uses sensitive where appropriate.
- Make patches for your changes and get them accepted into foolscap and tahoe-lafs.
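One possible shape for the first two steps, as a self-contained sketch. It does not use foolscap at all: the event dictionaries and the `make_event`, `censor_event`, and `render` names are hypothetical, and it assumes sensitivity is recorded per named field rather than per substring.

```python
def make_event(fmt, sensitive=(), **fields):
    """Build a log event, recording which named fields are sensitive."""
    return {"format": fmt, "fields": dict(fields), "sensitive": sorted(sensitive)}

def censor_event(event, placeholder="<censored>"):
    """Return a copy of the event with every sensitive field replaced."""
    fields = dict(event["fields"])
    for name in event["sensitive"]:
        if name in fields:
            fields[name] = placeholder
    return {"format": event["format"], "fields": fields,
            "sensitive": list(event["sensitive"])}

def render(event):
    """Format the event for display using %-style named substitution."""
    return event["format"] % event["fields"]

if __name__ == "__main__":
    ev = make_event("connecting to %(furl)s on port %(port)d",
                    sensitive=["furl"], furl="pb://example", port=55368)
    print(render(ev))                # full message, kept on the node
    print(render(censor_event(ev)))  # safe to attach to a bug report
```

Keeping the uncensored event on disk and censoring only on export means the node never has to be reconfigured or restarted to produce a shareable log.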
Between GSoC and school, I'm not going to have time to do all of that before 1.7 is due, so I'm unaccepting this ticket in case someone else wants to finish what I've started. I implemented the second step (flogtool censor), but as tahoe censor. I'm attaching that, and the tests I wrote for it, to this ticket -- maybe they'll be useful somehow to whoever accepts this ticket. If I do get time, I'll re-accept it and continue working on it.
comment:10 Changed at 2010-05-01T23:50:09Z by kevan
- Owner changed from kevan to somebody
- Status changed from assigned to new
comment:11 Changed at 2010-06-16T04:25:27Z by davidsarah
- Keywords review-needed added
- Milestone changed from 1.7.0 to 1.7.1
comment:12 Changed at 2010-07-11T17:41:57Z by zooko
- Keywords review-needed removed
- Milestone changed from 1.7.1 to undecided
It sounds from Kevan's comment:9 as though he would not recommend committing these patches to Tahoe-LAFS trunk. Therefore I'm unsetting "review-needed".
comment:13 Changed at 2012-02-23T00:35:24Z by davidsarah
- Milestone changed from undecided to soon
comment:14 Changed at 2013-01-14T06:29:15Z by zooko
- Description modified (diff)
- Keywords confidentiality added
- Milestone changed from soon to eventually
- Summary changed from censor introducer furl from log files to add a "censor" command to filter out sensitive information from log files
comment:15 Changed at 2013-01-14T08:02:57Z by zooko
Other potentially sensitive information that shows up in foolscap logs (including incident report files):
- storage server furls
- the exact sizes of files
- the self-chosen nicknames of servers
comment:16 Changed at 2013-01-14T08:18:34Z by zooko
comment:17 Changed at 2013-01-14T09:00:43Z by zooko
- Description modified (diff)
comment:18 Changed at 2013-01-14T09:06:26Z by zooko
- Description modified (diff)
If you like this bug, you might also like #823.