Opened at 2011-10-25T11:02:37Z
Last modified at 2020-10-30T12:35:44Z
#1570 closed defect
S3 backend: support streaming writes to immutable shares — at Initial Version
Reported by: | davidsarah | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | undecided |
Component: | code-storage | Version: | 1.9.0b1 |
Keywords: | security anti-censorship streaming performance memory s3 cloud-backend storage | Cc: | |
Launchpad Bug: |
Description
For immutable shares, the current S3 backend implementation writes data to a StringIO on each call to write_share_data (corresponding to remote_write), and only PUTs the share to S3 when the share is closed (corresponding to remote_close). This increases latency, and requires memory usage in the storage server proportional to the size of the share.
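A minimal sketch of the buffering pattern described above, to make the memory cost concrete. The class name `BufferedImmutableShare` and the `put_object` callable are hypothetical stand-ins, not the actual backend code; only `write_share_data` and the close step come from the description.

```python
from io import BytesIO


class BufferedImmutableShare:
    """Hypothetical illustration of buffer-then-PUT; not the real backend class."""

    def __init__(self, put_object):
        # put_object is an assumed callable that uploads the complete share bytes.
        self._put_object = put_object
        self._buf = BytesIO()

    def write_share_data(self, offset, data):
        # Each write lands in memory only; nothing is sent to S3 yet.
        self._buf.seek(offset)
        self._buf.write(data)

    def close(self):
        # The whole share is held in memory until this single upload.
        return self._put_object(self._buf.getvalue())
```

In this sketch nothing reaches the store until `close()`, so peak memory grows with the share size and the upload latency is paid entirely at the end.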
Sending the data over HTTP as it is received by calls to remote_write is not difficult in principle, although it isn't currently supported by txaws. However, how would we calculate the MD5 hash of the share file? The Content-MD5 header must be sent before the body, but the hash is not known until the last byte has been received. The header is optional, but without it there is an easy replay attack against the S3 authentication, allowing an attacker to replace share files without knowing the S3 secret key. (They cannot break integrity, but they can arbitrarily delete data.) A possible workaround is to require TLS for PUTs to S3, so that the Authorization header is secret and not available for replay. Note that this doesn't require the storage server to be authenticated at the TLS level, but it would require validating Amazon's certificate (or that of an S3 lookalike service) in order to prevent MITM attacks.
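To illustrate why the MD5 question matters for streaming, here is a small sketch that computes a Content-MD5 value incrementally. It assumes the writes arrive sequentially and in order, and the `StreamingMD5` class is a hypothetical helper, not part of txaws or the storage server. The header value is the base64 encoding of the raw 16-byte digest.

```python
import base64
import hashlib


class StreamingMD5:
    """Hypothetical helper: hash share data chunk by chunk as it arrives."""

    def __init__(self):
        self._md5 = hashlib.md5()

    def update(self, data):
        # Assumes chunks are fed in the same order they appear in the share.
        self._md5.update(data)

    def content_md5(self):
        # Content-MD5 is the base64 encoding of the binary MD5 digest.
        return base64.b64encode(self._md5.digest()).decode("ascii")
```

The value returned by `content_md5()` is only final after the last chunk has been fed in, whereas an HTTP request sends its headers before the body. That ordering is the obstacle to combining a signed Content-MD5 with a streamed PUT.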