#990 closed defect (fixed)

Web gateway should avoid caching plaintext of downloads

Reported by: jsgf
Owned by: nobody
Priority: major
Milestone: 1.8.0
Component: code-frontend
Version: 1.6.0
Keywords: confidentiality download cache webapi
Cc:
Launchpad Bug:

Description

The web gateway will (on occasion) locally cache files in unencrypted form, such as when handling ranged GET requests.

Now in normal use that's perfectly OK because web gateways are trusted with our unencrypted data and so having the data present in that form should be OK.

But my mental model of a gateway machine is that it's just a stateless waypoint which doesn't store anything locally. If I have a setup where there's a gateway machine within my network serving several machines, I would expect it to not have any persistent memory of my data, and so when it comes time to replace the HD I don't need to worry about scrubbing the disk, at least for Tahoe's sake. (Let's assume swap has been dealt with.)

Therefore, I think the gateway should keep the cache files encrypted and only decrypt them on the fly as they're being sent to its clients. I'm not sure what the key should be, but it should be per-file and transient (derived from the cap/root hash/something else?) rather than some local state (which would defeat the purpose of encrypting in the first place).
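A minimal sketch of what a per-file, transient cache key could look like, assuming the key is derived from the cap alone so that nothing on the gateway's disk can decrypt the cache. The names derive_cache_key and keystream_xor, the purpose label, and the example cap string are all hypothetical, and the SHA-256 counter-mode keystream is only a stand-in for a real cipher such as AES-CTR; it is here to show the seekable, decrypt-on-the-fly shape, not to be used as is.

```python
import hashlib
import hmac


def derive_cache_key(read_cap: bytes, purpose: bytes = b"gateway-cache-v1") -> bytes:
    """Derive a 32-byte per-file key from the cap only (no local secret)."""
    return hmac.new(read_cap, purpose, hashlib.sha256).digest()


def keystream_xor(key: bytes, offset: int, data: bytes) -> bytes:
    """XOR data with a keystream positioned at byte `offset` of the file.

    Toy cipher for illustration only: each 32-byte keystream block is
    SHA-256(key || block_number). Because XOR is its own inverse, the same
    function both encrypts and decrypts, and any byte range can be handled
    without touching the rest of the file.
    """
    out = bytearray()
    block = offset // 32
    skip = offset % 32
    pos = 0
    while pos < len(data):
        pad = hashlib.sha256(key + block.to_bytes(8, "big")).digest()[skip:]
        chunk = data[pos:pos + len(pad)]
        out.extend(a ^ b for a, b in zip(chunk, pad))
        pos += len(chunk)
        block += 1
        skip = 0
    return bytes(out)


# Hypothetical cap string; a real cap would come from the client's request.
cap = b"URI:CHK:example"
key = derive_cache_key(cap)
plaintext = bytes(range(100))
cached = keystream_xor(key, 0, plaintext)          # what would be written to disk
served = keystream_xor(key, 40, cached[40:70])     # decrypt only the requested range
```

With this shape the gateway can only regenerate the key while a client is presenting the cap, which matches the requirement that the key not be local state.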

Could possibly be handled as part of the downloader rewrite of #798?

Change History (18)

comment:1 Changed at 2010-03-11T19:40:38Z by jsgf

  • Keywords confidentiality download cache added

comment:2 follow-up: Changed at 2010-03-11T22:45:16Z by davidsarah

  • Component changed from unknown to code-frontend
  • Keywords webapi sftp ftpd added
  • Milestone changed from undecided to 1.7.0
  • Summary changed from Web gateway should keep its caches encrypted to Web gateway should avoid caching plaintext

There are three options that would fix this:

  1. Use encrypted temporary files as suggested above
  2. Stop using temporary files
  3. Securely overwrite temporary files before closing them

However 2. may require too much memory, and 3. may leave plaintext accessible if the gateway crashes.

Note that the SFTP and FTP frontends also use unencrypted temporary files to handle write requests (perhaps that should be split into a separate ticket, but this one will do for the time being).

#991 was a duplicate.

comment:3 in reply to: ↑ 2 Changed at 2010-03-11T22:49:45Z by davidsarah

Replying to davidsarah:

  2. Stop using temporary files

...

However 2. may require too much memory, ...

More precisely, too much address space on 32-bit machines. The amount of virtual memory used by this option would not be a problem per se if we had unlimited address space and an OS with a well-designed virtual memory subsystem (but we don't).

comment:4 Changed at 2010-03-11T23:00:56Z by jsgf

I think the downloader's use of temp files is fairly unfortunate (downloading a 200MB file to satisfy a small range), so option 2 isn't out of the question. (There may be some value in caching for performance reasons, but that isn't why the downloader caches now.)

Secure deletion is pretty much impossible unless the filesystem and storage subsystem support it. It certainly isn't sufficient to just overwrite the file and hope that the write hits all the same blocks the original data hit. So I think 3 is out.

comment:5 follow-up: Changed at 2010-03-11T23:04:13Z by zooko

My favorite solution to this would be to implement #320 (add streaming (on-line) upload to HTTP interface) so that the gateway doesn't use the disk at all. #320 would offer great improvements, IMO, in performance and flexibility.

You have to give up on convergent encryption whenever you choose streaming upload (although I wonder if we could get some of it back by defining an encryption key from the secure hash of each segment in turn (including the added convergence secret) and using that key to encrypt the next segment..).
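A rough sketch of the chained-key idea floated above, under stated assumptions: the function name chained_segment_keys is hypothetical, the first segment's key is derived from the convergence secret alone (the comment does not say how the first segment would be keyed), and the previous key is folded into each hash so that a key depends on every earlier segment. It only shows the key schedule, not the encryption itself.

```python
import hashlib


def chained_segment_keys(segments, convergence_secret: bytes):
    """Yield one 32-byte key per segment, in order.

    The key for segment i+1 is a hash over the convergence secret, the key
    for segment i, and the plaintext of segment i, so it can be computed as
    soon as segment i has been read from the stream.
    """
    # Assumption: the first key cannot depend on any plaintext yet.
    key = hashlib.sha256(b"first-segment:" + convergence_secret).digest()
    for segment in segments:
        yield key
        key = hashlib.sha256(convergence_secret + key + segment).digest()


secret = b"example-convergence-secret"
keys_a = list(chained_segment_keys([b"seg0", b"seg1", b"seg2"], secret))
keys_b = list(chained_segment_keys([b"seg0", b"seg1", b"seg2"], secret))
keys_c = list(chained_segment_keys([b"SEG0", b"seg1", b"seg2"], secret))
```

Identical streams produce identical key sequences, which is the part of convergence that survives. In this sketch the first segment is always encrypted under a content-independent key, so it gets no more protection than the convergence secret provides.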

comment:6 in reply to: ↑ 5 ; follow-up: Changed at 2010-03-11T23:07:03Z by jsgf

Replying to zooko:

My favorite solution to this would be to implement #320 (add streaming (on-line) upload to HTTP interface) so that the gateway doesn't use the disk at all. #320 would offer great improvements, IMO, in performance and flexibility.

However, in this case we're talking about cache files for *downloading*. When you do a byte-range GET request, you lose streaming downloads.

comment:7 Changed at 2010-03-11T23:08:10Z by jsgf

  • Summary changed from Web gateway should avoid caching plaintext to Web gateway should avoid caching plaintext of downloads

comment:8 Changed at 2010-03-11T23:20:52Z by jack.lloyd

As a workaround of sorts, you could set the tempdir to point to an encrypted partition (or, on Linux, a tmpfs (an in-memory filesystem) backed by encrypted swap). And using encrypted swap is generally desirable for a host of other reasons; it's a shame nobody but OpenBSD (IIRC) actually does it by default.

Obviously this is impractical in many situations and undesirable in others, but I thought it would be good to point it out, for the subset of users for whom it might be useful.

I think option (3) is a non-starter - dealing with journaling filesystems, SSDs, etc. makes this nearly impossible in the general case. You can probably get it to work for, say, 9 in 10 users, but leaving 1 in 10 silently vulnerable is not good, and I'd think the effort of doing this would be much better spent on eliminating temp files where possible or encrypting them if that's not feasible.

comment:9 in reply to: ↑ 6 ; follow-up: Changed at 2010-03-12T00:15:29Z by zooko

Replying to jsgf:

Replying to zooko:

My favorite solution to this would be to implement #320 (add streaming (on-line) upload to HTTP interface) so that the gateway doesn't use the disk at all. #320 would offer great improvements, IMO, in performance and flexibility.

However, in this case we're talking about cache files for *downloading*. When you do a byte-range GET request, you lose streaming downloads.

Oh, right, *downloads* when you are doing a range-request. My favorite solution to that is #798 (improve random-access download to retrieve/decrypt less data) so that the web gateway downloads only the segments needed to satisfy your range request and just keeps them in RAM until it has satisfied you.

comment:10 in reply to: ↑ 9 Changed at 2010-03-12T00:52:03Z by jsgf

Replying to zooko:

Oh, right, *downloads* when you are doing a range-request. My favorite solution to that is #798 (improve random-access download to retrieve/decrypt less data) so that the web gateway downloads only the segments needed to satisfy your range request and just keeps them in RAM until it has satisfied you.

Yes, that's why I mentioned #798 in the report as possibly the best way of solving the problem ;)

comment:11 Changed at 2010-03-14T03:58:39Z by warner

yeah, the new downloader won't touch the disk at all. It fetches exactly the segment required to satisfy the first part of the range request, delivers the plaintext, then forgets about that segment and moves on to the next one.
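The segment-at-a-time behaviour described here comes down to mapping a byte range onto segment indexes. The following is an illustrative sketch only, not the downloader's actual code: the function name segments_for_range and its parameters are assumptions, and it presumes fixed-size segments with a possibly shorter final one.

```python
def segments_for_range(offset: int, length: int, segment_size: int, file_size: int):
    """Map a byte range onto the segments that cover it.

    Returns a list of (segment_index, start, end) tuples, where start and end
    are byte positions within that segment (end exclusive). A range that runs
    past the end of the file is truncated.
    """
    if offset < 0 or length < 0 or segment_size <= 0 or file_size < 0:
        raise ValueError("offset, length and file_size must be >= 0, segment_size > 0")
    end = min(offset + length, file_size)
    pieces = []
    pos = offset
    while pos < end:
        seg = pos // segment_size
        seg_start = seg * segment_size
        seg_end = min(seg_start + segment_size, file_size)
        take_to = min(end, seg_end)
        pieces.append((seg, pos - seg_start, take_to - seg_start))
        pos = take_to
    return pieces


# A 100-byte range starting at byte 250 of a 1000-byte file with 100-byte
# segments touches only segments 2 and 3.
pieces = segments_for_range(offset=250, length=100, segment_size=100, file_size=1000)
```

Only the listed segments need to be fetched and decrypted, one at a time, so the plaintext held at any moment is bounded by one segment rather than the whole file.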

comment:12 follow-up: Changed at 2010-05-16T03:21:16Z by davidsarah

The new SFTP implementation in #1037 uses an encrypted temp file for uploads. So that will be fixed as well in 1.7.

comment:13 in reply to: ↑ 12 Changed at 2010-05-16T03:23:49Z by davidsarah

  • Milestone changed from 1.7.0 to 1.8.0

Replying to davidsarah:

The new SFTP implementation in #1037 uses an encrypted temp file for uploads. So that will be fixed as well in 1.7.

Not "as well", because the new downloader has been deferred to 1.8.

comment:14 Changed at 2010-06-13T03:08:56Z by davidsarah

  • Keywords sftp removed

comment:15 Changed at 2010-07-11T22:04:25Z by davidsarah

  • Keywords ftpd removed

comment:16 follow-up: Changed at 2010-08-14T18:08:38Z by warner

The #798 new immutable downloader has landed, and does not touch the disk. (the mutable downloader doesn't touch the disk either). The webapi interface to it uses the correct read(consumer,offset,length) interface. The upload-side webapi server will still put large (>100kB) plaintext files on disk (in an anonymous tempfile), and I don't know what the FTP/SFTP code does.

Is this ticket narrowly-scoped enough that we can now close it?

comment:17 in reply to: ↑ 16 ; follow-up: Changed at 2010-08-14T20:47:03Z by davidsarah

  • Resolution set to fixed
  • Status changed from new to closed

Replying to warner:

The #798 new immutable downloader has landed, and does not touch the disk. (the mutable downloader doesn't touch the disk either). The webapi interface to it uses the correct read(consumer,offset,length) interface. The upload-side webapi server will still put large (>100kB) plaintext files on disk (in an anonymous tempfile),

Perhaps it should be using EncryptedTemporaryFile? That would be a new ticket, though.

and I don't know what the FTP/SFTP code does.

Both the FTP and SFTP code only do full downloads. They sometimes use EncryptedTemporaryFiles, but don't store plaintext on disk (see 05022dca36780b3b).

Is this ticket narrowly-scoped enough that we can now close it?

Yes.

comment:18 in reply to: ↑ 17 Changed at 2010-08-21T03:17:22Z by davidsarah

Replying to davidsarah:

Replying to warner:

The upload-side webapi server will still put large (>100kB) plaintext files on disk (in an anonymous tempfile),

Perhaps it should be using EncryptedTemporaryFile? That would be a new ticket, though.

This is #1176.
