Opened at 2008-01-08T23:49:41Z
Last modified at 2019-07-25T13:58:02Z
#266 new enhancement
when cryptography has random-access AES, update helper to use it
Reported by: | warner | Owned by: | warner |
---|---|---|---|
Priority: | minor | Milestone: | eventually |
Component: | code-encoding | Version: | 0.7.0 |
Keywords: | upload-helper pycryptopp performance random-access | Cc: | |
Launchpad Bug: |
Description (last modified by zooko)
to build the offloaded-uploader (#116), we need the ability to do AES CTR mode at arbitrary places in the input stream. I think a second (optional) argument to aes.AES.process() that accepts a byte offset would do the trick. I'm imagining something like:
    def process(self, data, offset=None):
        if offset is None:
            offset = self._last_offset
        block = aes_encrypt(self._key, number_as_string(offset))
        output = xor(data, block)
        self._last_offset = offset + len(data)
        return output
(with all the appropriate joy of handling block boundaries, of course)
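A slightly fuller sketch of the proposed process(data, offset) with the block-boundary handling spelled out might look like the following. The class name SeekableCTR, the helper toy_block_encrypt, and the convention that the counter starts at zero are all assumptions made for illustration. The block function is a hash-based stand-in so the example runs with only the standard library; it is not AES and not what pycryptopp does.

```python
import hashlib

BLOCK_SIZE = 16  # AES block size in bytes


def toy_block_encrypt(key, counter_block):
    """Hypothetical stand-in for AES block encryption. NOT a real cipher."""
    return hashlib.sha256(key + counter_block).digest()[:BLOCK_SIZE]


class SeekableCTR:
    """CTR-mode processor whose process() accepts an optional byte offset."""

    def __init__(self, key, block_encrypt=toy_block_encrypt):
        self._key = key
        self._block_encrypt = block_encrypt
        self._last_offset = 0

    def process(self, data, offset=None):
        if offset is None:
            offset = self._last_offset
        # Which keystream block the offset falls in, and how far into it.
        first_block, skip = divmod(offset, BLOCK_SIZE)
        # Enough whole blocks to cover the skipped prefix plus the data.
        nblocks = (skip + len(data) + BLOCK_SIZE - 1) // BLOCK_SIZE
        keystream = b"".join(
            self._block_encrypt(
                self._key,
                ((first_block + i) % (1 << 128)).to_bytes(BLOCK_SIZE, "big"),
            )
            for i in range(nblocks)
        )
        # Drop the part of the first block that precedes the offset.
        keystream = keystream[skip:skip + len(data)]
        output = bytes(a ^ b for a, b in zip(data, keystream))
        self._last_offset = offset + len(data)
        return output
```

With this shape, a caller can process the tail of a stream before the head, or re-process a single segment, and get the same bytes as a sequential pass over the whole input.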
Change History (14)
comment:1 Changed at 2008-01-09T01:14:34Z by warner
- Description modified (diff)
- Priority changed from major to minor
comment:2 Changed at 2008-01-23T02:22:27Z by zooko
- Milestone changed from 0.8.0 (Allmydata 3.0 Beta) to undecided
comment:3 Changed at 2008-01-24T02:00:31Z by warner
Note: once pycryptopp can do this, revisit upload.py EncryptAnUploadable.read_encrypted, to allow it to avoid encrypting all the data that we're going to throw away (when hash_only==True).
comment:4 Changed at 2008-02-14T00:02:03Z by warner
it sounds like zooko just added this feature to pycryptopp-0.3.0, right?
comment:5 Changed at 2008-02-14T03:12:07Z by zooko
Yes. You can pass an "iv" argument to the constructor of an AES object now.
comment:6 Changed at 2008-04-25T00:00:30Z by warner
- Component changed from code to code-performance
- Keywords helper added
- Summary changed from pycryptopp: we need random-access AES encryption to update helper to take advantage of random-access AES encryption
ok, changing the ticket description to reflect that this is now an action item for the helper, rather than pycryptopp
comment:7 Changed at 2008-04-25T00:00:45Z by warner
- Owner changed from zooko to warner
comment:8 Changed at 2009-04-15T19:18:31Z by warner
hrm. Tell me about this "iv" argument: is it a string or an integer? I think I'd rather have pycryptopp be responsible for the pack-int-to-string and block-alignment issues; it is more likely to get them right than I am.
It would also be nice to be able to pass an offset into process(), so you could create one AES object and then seek around with it (rather than creating a new AES object for every segment), but if that were a hassle I suppose I could live without it.
What happens if my segment starts on a non-AES-blocksize boundary?
Incidentally, I added pycryptopp#18 to request this feature, having forgotten about this #266 ticket.
comment:9 Changed at 2009-08-27T06:21:17Z by warner
- Summary changed from update helper to take advantage of random-access AES encryption to when pycryptopp has random-access AES, update helper to use it
As I understand it, pycryptopp's iv= argument is *not* sufficient for this, at least not without a huge amount of work on the tahoe side. iv= is a string, which defaults to an AES-sized block of all zeros. To process data at an arbitrary location, the application must figure out the right offset, pack that value into the IV block, handle non-multiple-of-the-blocksize shifts, and then keep track of exactly how much data you pass into process() if you want to call it more than once. Then, to actually seek() to a different location, you have to throw away that AES object and make a new one, since the AES() constructor is the only place that takes an iv= argument, not process().
Updating title to reflect this. Note that pycryptopp ticket 18 is still the feature request: this ticket is to remind us (on tahoe) that this sort of feature is blocked on pycryptopp-18. Also note that better random-access downloading (i.e. anything other than starting at the beginning of the file and fetching everything from there to the point of interest) also requires random-access AES.
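The bookkeeping described in comment:9 (work out the block number, pack it into the IV block, and account for a start that is not a multiple of the block size) can be sketched roughly as below. The function name counter_block_for_offset is hypothetical, and the sketch assumes the CTR counter is treated as a 128-bit big-endian integer added to the starting IV; whether a given library increments the counter exactly this way should be checked against its documentation.

```python
from typing import Tuple

AES_BLOCK = 16  # AES block size in bytes


def counter_block_for_offset(iv: bytes, offset: int) -> Tuple[bytes, int]:
    """Return (counter_block, skip) for a byte offset into a CTR stream.

    counter_block is the IV advanced by the number of whole blocks that
    precede the offset. skip is how many keystream bytes of that block
    must be discarded before the data at the offset lines up.
    Assumption: the counter is a 128-bit big-endian integer.
    """
    if len(iv) != AES_BLOCK:
        raise ValueError("iv must be exactly one AES block (16 bytes)")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    blocks, skip = divmod(offset, AES_BLOCK)
    counter = (int.from_bytes(iv, "big") + blocks) % (1 << 128)
    return counter.to_bytes(AES_BLOCK, "big"), skip
```

For example, with the default all-zeros IV, byte offset 35 falls in block 2 with 3 bytes to skip. With an API that only accepts the IV in the constructor, the caller would presumably build a new AES object from the returned block for every seek and discard the first skip bytes of keystream (for instance by processing skip zero bytes and ignoring the result), which is the inconvenience this ticket is about.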
comment:10 Changed at 2009-12-04T04:54:24Z by davidsarah
- Component changed from code-performance to code-encoding
- Keywords performance added
comment:11 Changed at 2009-12-20T17:33:53Z by davidsarah
- Keywords random-access added
comment:12 Changed at 2015-08-16T15:26:33Z by zooko
- Description modified (diff)
- Keywords upload-helper added; helper removed
comment:13 Changed at 2019-07-25T13:13:50Z by exarkun
- Summary changed from when pycryptopp has random-access AES, update helper to use it to when cryptography has random-access AES, update helper to use it
ticket:3031 switched Tahoe-LAFS from pycryptopp to cryptography but cryptography also doesn't offer a random-access AES so I'm modifying this ticket to be about the same problem but with cryptography.
comment:14 Changed at 2019-07-25T13:58:02Z by exarkun
I filed ticket:3230 for considering an alternate solution. From #cryptography-dev there was neither a clear yes nor a clear no on whether providing a simpler API in cryptography would be sensible. The use case is slightly obscure and the functionality is already available, just with a slightly inconvenient API. However, OpenStack Swift does basically the same thing as Tahoe-LAFS: https://github.com/openstack/swift/blob/ae36030278b244148363d4a1c6ed0c7fc7cb5204/swift/common/middleware/crypto/crypto_utils.py#L76-L84
The fact that there are two applications seemed to carry some weight.
hmm, robk astutely points out that we might want to re-encrypt the data on the second pass anyways, to build up the hash trees. So we can build the offloaded uploader without this. Lowering the priority.