Opened at 2010-08-09T00:11:18Z
Last modified at 2013-09-11T03:41:35Z
#46 new enhancement
Add combined AES+XSalsa20 cipher module
Reported by: | randombit | Owned by: | from_pycon |
---|---|---|---|
Priority: | major | Milestone: | 0.7.0 |
Version: | 0.5.19 | Keywords: | xsalsa20 aes combiner design-review-needed |
Cc: | lloyd@…, davidsarah | Launchpad Bug: |
Description (last modified by zooko)
For preserving confidentiality in the event of a break in AES, we want to combine AES (256 bit, CTR mode) with XSalsa20. This will simply process the message with both in sequence; it doesn't matter which order they are applied in, as both are effectively key stream generators, so AES-CTR(XSalsa20(m)) == XSalsa20(AES-CTR(m)).
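The commutativity claim follows from both ciphers XORing a key-dependent keystream into the message. Here is a minimal sketch of that property, assuming a toy SHA-256-based keystream as a stand-in; `toy_keystream` and `xor_with_keystream` are hypothetical helpers, not AES-CTR, XSalsa20, or any pycryptopp API.

```python
import hashlib


def toy_keystream(key: bytes, length: int) -> bytes:
    """Stand-in keystream: SHA-256(key || counter) blocks. NOT AES-CTR or XSalsa20."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_with_keystream(data: bytes, key: bytes) -> bytes:
    """XOR data with the toy keystream; applying it twice restores the input."""
    keystream = toy_keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, keystream))


message = b"attack at dawn"
key_a = b"A" * 32  # plays the role of the AES key
key_b = b"B" * 32  # plays the role of the XSalsa20 key

# Apply the two layers in both orders.
a_then_b = xor_with_keystream(xor_with_keystream(message, key_a), key_b)
b_then_a = xor_with_keystream(xor_with_keystream(message, key_b), key_a)
```

Because XOR is commutative and associative, both orders yield `m XOR ks_a XOR ks_b`, and decryption can likewise remove the layers in either order.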
This requires us to have 512 bits worth of key material, because both AES-256 and XSalsa20 use 256 bit keys, plus 320 bits of initialization vector data (128 for AES and 192 for XSalsa20).
Long keys are problematic for usability reasons (a longer key requires a longer capability string, and 256 bits is about as long as we can reasonably make them), so we'll want to instead derive both AES and XSalsa20 keys from a 256 bit input using a strong KDF. We'll use HKDF for this purpose. Thus, the overall construction that will be exported from pycryptopp will look like this:
AES_plus_XSalsa20(m, masterkey_256, iv):
    hkdf = HKDF(masterkey_256)
    aes_key_256 = hkdf.make(32)
    xsalsa_key_256 = hkdf.make(32)
    (aes_iv, xsalsa_iv) = split iv into 128 + 192 bit pieces
    aes_encrypted = AES_CTR(m, aes_key_256, aes_iv)
    xsalsa_encrypted = XSalsa20(aes_encrypted, xsalsa_key_256, xsalsa_iv)
    return xsalsa_encrypted
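The key-derivation and IV-splitting steps of the pseudocode can be sketched with the standard library alone. This is an illustration under assumptions, not the attached patch: HKDF is instantiated with HMAC-SHA-256 (RFC 5869), the two `hkdf.make(32)` calls are modelled as one 64-byte expansion split in half, and the `info` label and function names are hypothetical.

```python
import hashlib
import hmac
from typing import Tuple

HASH_LEN = 32  # SHA-256 output size in bytes


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract (RFC 5869): PRK = HMAC(salt, IKM); empty salt means zeros."""
    if not salt:
        salt = b"\x00" * HASH_LEN
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869): T(i) = HMAC(PRK, T(i-1) || info || i)."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output too long")
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_combiner_keys(master_key: bytes) -> Tuple[bytes, bytes]:
    """Derive (aes_key, xsalsa_key) from a 256-bit master key."""
    if len(master_key) != 32:
        raise ValueError("master key must be 32 bytes")
    prk = hkdf_extract(b"", master_key)
    # The info label below is a made-up placeholder, not taken from the ticket.
    okm = hkdf_expand(prk, b"hypothetical-combiner-label", 64)
    return okm[:32], okm[32:]


def split_iv(iv: bytes) -> Tuple[bytes, bytes]:
    """Split a 320-bit IV into 128 bits for AES and 192 bits for XSalsa20."""
    if len(iv) != 40:
        raise ValueError("IV must be 40 bytes (128 + 192 bits)")
    return iv[:16], iv[16:]


aes_key, xsalsa_key = derive_combiner_keys(b"\x01" * 32)
aes_iv, xsalsa_iv = split_iv(bytes(range(40)))
```

The two derived keys are distinct HKDF output blocks, so the capability string only needs to carry the 256-bit master key.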
Practically speaking, it appears that at the moment Tahoe does not use the ability to set an IV except for sequential access into the stream, and otherwise always uses an IV of all zeros. (This is fine because the keys are generated randomly or via content hashing and thus will always differ, except when you are encrypting identical messages, in which case you'll get identical ciphertext, which is a desirable property.) We'll have to make some modifications there when it comes time to implement XSalsa20+AES decryption, because XSalsa20's IV is merely a diversification parameter; the counter exists elsewhere in the state (it can be modified in Crypto++ by calling SeekToIteration).
This is part of the Tahoe-LAFS One Hundred Year Cryptography project.
This is to be used for Tahoe-LAFS ticket https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1164
Attachments (7)
Change History (26)
Changed at 2010-08-09T15:46:34Z by xueyu
Changed at 2010-08-09T16:14:18Z by xueyu
Changed at 2010-08-11T14:25:15Z by xueyu
comment:1 Changed at 2010-08-11T14:32:10Z by xueyu
ciphercomb.cpp, ciphercomb.hpp and ciphercomb.py are the C++ implementation of the cipher combiner, wrapped for Python.
ciphercombiner.py is the pure-Python implementation.
One difference between the two is that in the C++ version the key serves as the IKM of HKDF, which then generates the combiner keys. In the Python version the keys can be generated independently.
Apologies: the patch seems to contain some conflict information from when I used "darcs record".
comment:2 Changed at 2010-09-03T18:11:07Z by zooko
Here are some reasons why I might prefer AES-128, copied from IRC. Dcoder is Samuel Neves.
<Dcoder> what's the rationale for using AES128 instead of AES256 in the combined cipher thing? [11:26]
<Dcoder> i see that the original ticket, http://tahoe-lafs.org/trac/pycryptopp/ticket/46, still says AES256
<zooko> Dcoder: I'm concerned about CPU usage on ARM.
<zooko> :-/
<zooko> I guess what I *really* want is super optimized ARM implementation of AES and XSalsa20 and SHA-256. But also, XSalsa20⊕AES-128 should be at least as secure as AES-256 [11:36]
<zooko> and also, related-key-weirdness, not that I actually care because Tahoe-LAFS definitely doesn't allow related keys, but if AES-256 starts getting a worse reputation than AES-128 because of that then that becomes something we have to explain--why we're not vulnerable to that.
<zooko> Dcoder: what do you think? [11:37]
<Dcoder> zooko: i do not disagree [11:46]
<Dcoder> i was just curious because of the incongruence between tickets
I think the next step on this issue of AES-256 vs. AES-128 is to benchmark XSalsa20⊕AES-128 compared to XSalsa20⊕AES-256 on an ARM system. François Deppierraz has kindly offered us the use of his ARM NAS box for that purpose -- anybody who wants to run benchmarks on it just ask François for ssh access.
comment:3 Changed at 2011-12-03T20:09:07Z by zooko
- Description modified (diff)
comment:4 Changed at 2011-12-04T02:32:24Z by davidsarah
Re comment:2,
- I would not worry at all about related-key attacks. Anything using HKDF to generate keys is not vulnerable to them, period. I take the point about reputation and blow a big >raspberry< at it -- any loss of reputation resulting from related-key attacks would be likely to apply to AES in general, not specifically AES-256.
- The difference between 128-bit and 256-bit keys does actually matter to security (especially "100-year security" against multi-target and low-success-probability attacks). See Bernstein's "Understanding brute force" paper, which makes a pretty strong argument in favour of 256-bit (or at least longer than 128-bit) keys.
- It might seem unlikely that any brute-force attacks would be feasible given the cascade with XSalsa20, but it's not implausible that the shorter key length, or just as importantly the lower number of rounds, could help an attack that also took advantage of a weakness in XSalsa20 or AES.
- I don't think that the performance difference between 10 and 14 rounds should be a determining factor for this decision.
comment:5 follow-up: ↓ 7 Changed at 2011-12-04T02:44:14Z by davidsarah
I see some issues with the interface in ciphercombiner.py (which, to avoid confusion, is not the same as the interface of the pseudocode in the Description):
- why is the key parameter in the constructor optional?
- what are the setter methods for? They're not documented, and code using them would have to be relying on implementation details of this particular combiner.
- I would prefer a process method that takes a position, i.e. bytes from start of stream, as a parameter. (In contrast, it's not clear at all how the argument to setCombinerIV relates to the position.)
comment:6 Changed at 2011-12-04T02:45:41Z by davidsarah
- Cc davidsarah added
- Keywords aes combiner design-review-needed added
comment:7 in reply to: ↑ 5 Changed at 2011-12-04T02:57:10Z by davidsarah
Replying to davidsarah:
- what are the setter methods for? They're not documented, and code using them would have to be relying on implementation details of this particular combiner.
Hmm, the C++ code doesn't have them. Let's remove them (and add the position argument to both the Python and C++ versions).
comment:8 Changed at 2011-12-09T22:49:55Z by zooko
- Milestone set to 0.6.0
comment:9 Changed at 2012-03-11T00:48:38Z by warner
- Milestone changed from 0.6.0 to 0.7.0
Zooko and I looked at this one.. the IV argument needs work. Tahoe uses AES's IV= argument to perform random-access seeking through the keystream (since tahoe files are encrypted as one big block, but can be retrieved in little pieces, and we don't want to retrieve+decode+decrypt the whole file just for a short segment). Ticket #18 is about adding a cleaner "seek-to-offset" method or argument of some sort to AES. If we landed this combined AES+XSalsa20 as-is, its IV= argument couldn't be used as tahoe expects it, since the XSalsa20 side would interpret it differently. So we should probably:
1: remove the IV= argument from this
2: implement #18, adding a cleaner seek-to-offset API to AES
3: implement the same API here in AES+XSalsa20
4: when tahoe switches to use AES+XSalsa20, also switch to the new API
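To make the seek-to-offset idea concrete, here is a rough sketch of a `process(data, offset=...)` interface in the shape suggested for #18. It assumes a toy keystream built from SHA-256 over a block counter; `ToySeekableStream` is a hypothetical class, not the pycryptopp AES or XSalsa20 API. The point is only that a byte offset maps to a block index plus a number of keystream bytes to skip, with no separate IV argument.

```python
import hashlib

BLOCK_SIZE = 32  # bytes of keystream produced per counter value in this toy


class ToySeekableStream:
    """Toy counter-based stream cipher with random access. NOT a real cipher."""

    def __init__(self, key: bytes):
        self._key = key

    def _keystream(self, offset: int, length: int) -> bytes:
        # Map the byte offset to a starting block and bytes to discard.
        block_index = offset // BLOCK_SIZE
        skip = offset % BLOCK_SIZE
        out = bytearray()
        while len(out) < skip + length:
            out += hashlib.sha256(
                self._key + block_index.to_bytes(8, "big")
            ).digest()
            block_index += 1
        return bytes(out[skip:skip + length])

    def process(self, data: bytes, offset: int = 0) -> bytes:
        """Encrypt or decrypt data located at byte `offset` of the stream."""
        if offset < 0:
            raise ValueError("offset must be non-negative")
        keystream = self._keystream(offset, len(data))
        return bytes(a ^ b for a, b in zip(data, keystream))


cipher = ToySeekableStream(b"k" * 32)
plaintext = bytes(range(200))
ciphertext = cipher.process(plaintext)

# Decrypt only bytes 70..119 without touching the rest of the stream.
segment = cipher.process(ciphertext[70:120], offset=70)
```

A combined cipher could expose the same method and translate the offset separately for each underlying cipher, since their block sizes differ (16 bytes for AES-CTR, 64 bytes for Salsa20-family keystreams).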
Also, the patches (at least the .py file I looked at) have whitespace/indentation problems, and are importing an unused comb4p module.. we should probably run pyflakes over them.
Zooko and I agreed to push this out of the 0.6.0 milestone and into 0.7.0.
comment:10 Changed at 2012-03-22T04:36:03Z by from_pycon
Do we need both the C++ version and the Python version?
If yes, it seems like we can use the API of #18 (process(<string>, offset=<random access offset>)) for the Python version, and use the Crypto++ Seek() method for the C++ version.
We are interested in random access in both versions, correct?
comment:11 Changed at 2012-04-17T21:28:29Z by zooko
- Owner changed from dragonxue to from_pycon
from_pycon: sorry for the delay in responding to this.
I think it would be good to maintain both C and Python versions. In the long run we might find a Python implementation that is so efficient that we don't need the C version there (possibly a future version of PyPy?), and in the short run we can use them for comparing results and "executable documentation". Does that sound okay to you?
I agree that the API of #18 for the Python version and Crypto++'s Seek() method for the C++ version sound good.
comment:12 follow-up: ↓ 13 Changed at 2012-08-09T21:05:04Z by zooko
We talked about this feature in the Tahoe-LAFS Weekly Conference Call and I'll post notes soon. This will hopefully go into Tahoe-LAFS v1.11.
Here's the status of this:
- We need to fix the random-access feature as described by warner in comment:9
- A pure-Python implementation would be great but is not strictly required for this.
- A CPython API implementation is required.
- I want benchmarks of XSalsa20⊕AES-128 compared to XSalsa20⊕AES-256. I assume that on a high-powered amd64 machine there will be no real difference (although a measurement of that would be good). I want to know about a cheap, low-power ARM device: is the time to encrypt or decrypt a 1 GB file noticeably different with those two encryption schemes on your cheap low-power ARM device, and is the battery drain noticeably different?
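A per-byte timing harness for the requested benchmark might look roughly like the following. It is a sketch only: `nanoseconds_per_byte` is a hypothetical helper, and SHA-256 stands in for the workload because the combined cipher is not available here. Swap in the real encryption call to get meaningful numbers.

```python
import hashlib
import time


def nanoseconds_per_byte(func, size, repeats=3):
    """Time func(data) on `size` bytes; return the best run in ns per byte."""
    data = b"\x00" * size
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(data)
        best = min(best, time.perf_counter() - start)
    return best * 1e9 / size


def stand_in_workload(data):
    """Placeholder for an encryption call; hashes the buffer instead."""
    return hashlib.sha256(data).digest()


if __name__ == "__main__":
    # 10 MB buffer; adjust to taste on slow devices.
    result = nanoseconds_per_byte(stand_in_workload, 10 * 1000 * 1000)
    print("stand-in workload: %.1f ns per byte" % result)
```

Taking the best of several runs reduces noise from other processes. Measuring battery drain would need a separate, device-specific method.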
comment:13 in reply to: ↑ 12 Changed at 2012-08-11T04:24:25Z by zooko
Replying to zooko:
- I want benchmarks of XSalsa20⊕AES-128 compared to XSalsa20⊕AES-256.
Here they are. This script measures the time to crypt a 10 MB string and divides by 10,000,000 to get a time per byte.
On a fast amd64 workstation/server:
XSalsa20: 2 nanoseconds per byte crypted
AES-128: 5 nanoseconds per byte crypted
AES-256: 7 nanoseconds per byte crypted
On a weak, low-power, cheap ARM device:
XSalsa20: 138 nanoseconds per byte crypted
AES-128: 264 nanoseconds per byte crypted
AES-256: 339 nanoseconds per byte crypted
So the time to crypt a 1 GB file with XSalsa20⊕AES-128 on that tiny ARM device would be about 400 seconds, and the time to crypt it with XSalsa20⊕AES-256 would be about 475 seconds.
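The estimates above can be reproduced from the per-byte figures, assuming the cost of the cascade is simply the sum of the two ciphers' costs and that 1 GB means 10^9 bytes (both are assumptions of this sketch):

```python
# Measured nanoseconds per byte on the ARM device, from the results above.
NS_PER_BYTE_ARM = {"XSalsa20": 138, "AES-128": 264, "AES-256": 339}
FILE_SIZE_BYTES = 10**9  # 1 GB, assuming decimal gigabytes


def cascade_seconds(ciphers, table, size):
    """Estimate cascade time as the sum of each cipher's per-byte cost."""
    total_ns_per_byte = sum(table[name] for name in ciphers)
    return total_ns_per_byte * size / 1e9


with_aes128 = cascade_seconds(("XSalsa20", "AES-128"), NS_PER_BYTE_ARM, FILE_SIZE_BYTES)
with_aes256 = cascade_seconds(("XSalsa20", "AES-256"), NS_PER_BYTE_ARM, FILE_SIZE_BYTES)
slowdown = with_aes256 / with_aes128
```

This gives 402 seconds with AES-128 and 477 seconds with AES-256, so AES-256 makes the cascade roughly 19% slower on that device.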
comment:14 Changed at 2012-10-30T14:53:40Z by sickness
pycryptopp-0.6.0.38.1177269928068382358296250900071740393723434250323.post38# python2.6 misc/build_helpers/show-tool-versions.py
platform: Linux-2.6.32.11-svn52288-ppc-with-debian-6.0.6
machine: ppc
linux_distribution: ('debian', '6.0.6', '')
python: 2.6.6 (r266:84292, Dec 27 2010, 10:20:06) [GCC 4.4.5]
maxunicode: 1114111
locale: LANG= LANGUAGE= LC_CTYPE="C" LC_NUMERIC="C" LC_TIME="C" LC_COLLATE="C" LC_MONETARY="C" LC_MESSAGES="C" LC_PAPER="C" LC_NAME="C" LC_ADDRESS="C" LC_TELEPHONE="C" LC_MEASUREMENT="C" LC_IDENTIFICATION="C" LC_ALL=C
filesystem.encoding: ANSI_X3.4-1968
locale.getpreferredencoding: ANSI_X3.4-1968
locale.defaultlocale: (None, None)
locale.locale: (None, None)
buildbot: buildbot: no such file or directory
buildslave: buildslave: no such file or directory
g++: g++ (Debian 4.4.5-8) 4.4.5
cryptest: cryptest: no such file or directory
git: git: no such file or directory
openssl: OpenSSL 0.9.8g 19 Oct 2007 (Library: OpenSSL 0.9.8o 01 Jun 2010)
flappclient: Foolscap version: 0.5.0 Twisted version: 10.1.0
valgrind: valgrind: no such file or directory
as: GNU assembler (GNU Binutils for Debian) 2.20.1-system.20100303 Copyright 2009 Free Software Foundation, Inc. This program is free software; you may redistribute it under the terms of the GNU General Public License version 3 or later. This program has absolutely no warranty. This assembler was configured for a target of `powerpc-linux-gnu'.
setuptools: [distribute 0.6.14 (/usr/lib/python2.6/dist-packages)]
pyutil: DistributionNotFound
coverage: DistributionNotFound
pyflakes: DistributionNotFound
Twisted: [Twisted 10.1.0 (/usr/lib/python2.6/dist-packages)]
Twisted module: <module 'twisted' from '/usr/lib/python2.6/dist-packages/twisted/__init__.pyc'>
Twisted __version__: 10.1.0
TwistedCore: DistributionNotFound
TwistedCore module: <module 'twisted.python' from '/usr/lib/python2.6/dist-packages/twisted/python/__init__.pyc'>
pyOpenSSL: [pyOpenSSL 0.10 (/usr/lib/pymodules/python2.6)]
pyOpenSSL module: <module 'OpenSSL' from '/usr/lib/pymodules/python2.6/OpenSSL/__init__.pyc'>
pyOpenSSL __version__: 0.10
pycryptopp: [pycryptopp 0.5.17 (/usr/lib/pymodules/python2.6)]
pycryptopp module: <module 'pycryptopp' from '/usr/lib/pymodules/python2.6/pycryptopp/__init__.pyc'>
pycryptopp __version__: 0.5.17
crpyto: DistributionNotFound
comment:15 Changed at 2012-10-30T14:58:53Z by sickness
processor : 0
cpu : APM82181
clock : 800.000008MHz
revision : 28.130 (pvr 12c4 1c82)
bogomips : 1600.00
timebase : 800000008
platform : PowerPC 44x Platform
model : amcc,apollo3g
Memory : 256 MB
comment:16 Changed at 2012-10-30T21:16:48Z by sickness
So I tried running the gateway, the storage server, and also the introducer all on the same Western Digital My Book Live embedded box, and uploaded a bunch of big files with 1:1:1 encoding just to see how it performed. In sum, it uploaded 2281 MB in 49 minutes 40 seconds, so less than 1 MB/s. For the record, running just the storage server on this box, with the gateway on another box, averaged 3 MB/s, and just copying files with scp on this box averaged 8 MB/s. Here are the details of one upload from the gateway status page:
Started: 22:01:26 30-Oct-2012
Storage Index: k6zrpmsfiehlse5x3ozb6mvwnm
Helper?: No
Total Size: 735098880
Progress (Hash): 100.0%
Progress (Ciphertext): 100.0%
Progress (Encode+Push): 100.0%
Status: Finished
Upload Results
Shares Pushed: 1
Shares Already Present: 0
Sharemap: 0 -> placed on [qc62lj3v]
Servermap: [qc62lj3v] got share: #0
Timings:
File Size: 735098880 bytes
Total: 14 minutes (867.9kBps)
Storage Index: 31 seconds (23.47MBps)
[Contacting Helper]:
[Helper Already-In-Grid Check]:
[Upload Ciphertext To Helper]: ()
Peer Selection: 48ms
Encode And Push: 13 minutes (918.2kBps)
Cumulative Encoding: 3 minutes (4.03MBps)
Cumulative Pushing: 10 minutes (1.19MBps)
Send Hashes And Close: 4.07s
[Helper Total]:
comment:17 Changed at 2012-10-30T21:20:10Z by sickness
for the record, this is the version of tahoe-lafs used for this test:
allmydata-tahoe: 1.9.2,
foolscap: 0.6.3,
pycryptopp: 0.5.17,
zfec: 1.4.24,
Twisted: 10.1.0,
Nevow: 0.10.0,
zope.interface: unknown,
python: 2.6.6,
platform: Linux-debian_6.0.6-ppc-32bit_ELF,
pyOpenSSL: 0.10,
simplejson: 2.6.2,
pycrypto: 2.1.0,
pyasn1: unknown,
mock: 0.6.0,
sqlite3: 2.4.1 [sqlite 3.7.3],
setuptools: 0.6c16dev3
comment:18 Changed at 2013-09-11T03:40:49Z by zooko
The motive for this ticket is to implement https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1164 .
comment:19 Changed at 2013-09-11T03:41:35Z by zooko
- Description modified (diff)
cipher of combiner python version