#85 closed enhancement (fixed)
disable asm
Reported by: | zooko | Owned by: | daira |
---|---|---|---|
Priority: | major | Milestone: | 0.7.0 |
Version: | 0.5.29 | Keywords: | design-review-needed |
Cc: | Launchpad Bug: |
Description
When building the bundled copy of Crypto++, we should just disable ASM always. This will result in a performance penalty on platforms that formerly were able to use the asm implementation, but it would also avoid occasional weird bugs (most recently #84) and would make it easier for us (especially me, who doesn't really know ASM) to understand what it is doing.
Change History (30)
comment:1 Changed at 2013-04-09T22:00:52Z by sneves
comment:2 Changed at 2013-04-10T02:22:17Z by daira
Where is the code that selects AES-NI? I don't see it in any obvious place under https://tahoe-lafs.org/trac/pycryptopp/browser/git/src-cryptopp (rijndael.cpp/h, aes.h, cpu.cpp).
comment:3 Changed at 2013-04-10T02:27:19Z by sneves
The bits of rijndael.cpp that use AES-NI are guarded by CRYPTOPP_BOOL_AESNI_INTRINSICS_AVAILABLE. This macro itself is defined in config.h, line 293.
comment:4 Changed at 2013-04-11T06:04:16Z by zooko
See also #76.
comment:5 Changed at 2013-04-11T07:27:44Z by zooko
I pushed a patch that defined CRYPTOPP_DISABLE_ASM to the debug-netbsd-rsa-2 branch:
https://github.com/zooko/pycryptopp/commit/2a38bfd4b4ab714d5f624a4e505993b6aec4418b
And triggered a build of it on all builders:
https://tahoe-lafs.org/buildbot-pycryptopp/waterfall?last_time=1365665120
As well as telling us if this changes any behavior of unit tests, this also will show the difference in benchmarks between the previous version and the CRYPTOPP_DISABLE_ASM version. (To see this you have to manually click and visually compare the previous and current benchmarks on each builder.)
comment:6 Changed at 2013-04-12T08:28:40Z by zooko
Hm... So with CRYPTOPP_DISABLE_ASM, the quick self-test of SHA256 segfaults at startup, on the "atlas" ubuntu box and on midnightmagic's netbsd box:
But only when they are linking against the system-provided libcryptopp instead of the embedded cryptopp. I guess it is really necessary that the CRYPTOPP_DISABLE_ASM macro be set the same when we build pycryptopp as when the libcryptopp.so that we're linking against was built. It is a safe bet (I guess) that it is never set for any system-provided libcryptopp.so, so I'll just make it so that our setup.py sets CRYPTOPP_DISABLE_ASM only when it is building the embedded Crypto++.
comment:7 Changed at 2013-04-12T08:33:26Z by zooko
Okay, I committed that patch to https://github.com/zooko/pycryptopp/commits/85_disable_asm and triggered builds: https://tahoe-lafs.org/buildbot-pycryptopp/waterfall?last_time=1365755564
comment:8 Changed at 2013-04-12T08:59:57Z by zooko
Posted about this to tahoe-dev: https://tahoe-lafs.org/pipermail/tahoe-dev/2013-April/008164.html
comment:9 Changed at 2013-04-12T09:03:45Z by zooko
- Keywords review-needed added
Someone please review this branch!
comment:10 Changed at 2013-04-12T16:47:25Z by zooko
I posted an update on the Crypto++ mailing list: https://groups.google.com/forum/?fromgroups=#!topic/cryptopp-users/qGIdqp3MIgg
#85 appears to be on the way to fixing this bug, but it is somewhat dissatisfying since I don't know _exactly_ why #85 fixes it.
comment:11 Changed at 2013-04-12T19:20:58Z by daira
- Owner set to daira
- Status changed from new to assigned
I will review the branch.
comment:12 Changed at 2013-04-12T19:31:14Z by daira
- Keywords review-needed removed
- Owner changed from daira to zooko
- Status changed from assigned to new
Reviewed; needs resolution of this comment.
comment:13 Changed at 2013-04-13T03:35:05Z by zooko
Brian: please design-review the design proposed in the comments: https://github.com/zooko/pycryptopp/commit/5c2e0f729a25eee8f374ff79a9299a818c378bc2#commitcomment-3005081
comment:14 Changed at 2013-04-13T03:35:31Z by zooko
- Keywords design-review-needed added
- Owner changed from zooko to warner
comment:15 Changed at 2013-04-14T22:56:18Z by sneves
- Latest patch (https://github.com/zooko/pycryptopp/commits/85_disable_asm), is setting the (now unsupported) PYCRYPTOPP_DISABLE_EMBEDDED_CRYPTOPP environment var in pycryptopp.egg-info/stdeb.cfg
- In setup.py:35, 'supported' is misspelled.
comment:16 Changed at 2013-04-19T22:02:18Z by zooko
Thank you very much for the review. Committed it to https://github.com/zooko/pycryptopp master:
- https://github.com/zooko/pycryptopp/commit/345be57fe19c65f239b0d25b48b640f70d3d0b33
- https://github.com/zooko/pycryptopp/commit/c20ccaaf1d395712f1d9a8eb0c58ff5afba5100e
- https://github.com/zooko/pycryptopp/commit/f42c836b76b9bbd3eee3f011a46dda7708079199
- https://github.com/zooko/pycryptopp/commit/f42c836b76b9bbd3eee3f011a46dda7708079199
- https://github.com/zooko/pycryptopp/commit/75f2898907becffda46d5ec07dad31013ce7cb4a
There's an open pull request to https://github.com/tahoe-lafs/pycryptopp master, which I'm hoping will stimulate Brian to review these patches.
comment:17 Changed at 2013-04-19T22:02:27Z by zooko
- Resolution set to fixed
- Status changed from new to closed
comment:18 Changed at 2013-07-17T16:29:02Z by zooko
- Resolution fixed deleted
- Status changed from closed to reopened
This hasn't been applied to trunk yet.
comment:19 Changed at 2013-07-17T16:29:12Z by zooko
- Status changed from reopened to new
comment:20 Changed at 2014-07-22T19:22:15Z by zooko
We had already agreed to disable assembly optimizations in pycryptopp, because there seem to have been a lot of bugs in the optimized assembly code in the past, and because the added speed really makes no difference to our uses, as far as I know.
However, in order to explain and justify to other people (e.g. Debian packagers) why we are doing this, and why they should consider doing the same thing themselves, I just read through the entire history of issues in pycryptopp and classified whether they were runtime errors (and therefore potential security bugs) or build-time errors (therefore probably not), and whether they would have been avoided if we had been disabling assembly optimizations all along. Here are the results. They clearly show that we should disable the optimized assembly! About half of all the security-threatening bugs we've had would never have been an issue if we'd avoided assembly from the beginning.
By the way, in my opinion the author of Crypto++, Wei Dai, is an exceptionally skilled, careful, and experienced coder, and I would assume that if Crypto++ has had this many security-threatening bugs in its optimized assembly code, then other crypto libraries that also use optimized assembly code have also had at least as many.
bugs that cause run-time failures
(These bugs are potential security issues.)
- would have been avoided by DISABLE_ASM:
- unclear if it would have been avoided if we'd used DISABLE_ASM:
- would not have been avoided by DISABLE_ASM (but would have been avoided by using cffi instead of CPython API)
- would have been avoided if we *didn't* use DISABLE_ASM! (A bug only in the non-ASM version!)
bugs that cause deterministic build or compilation failures
(These bugs are *typically* not potential security issues but they can be, and in any case they are engineering/deployment issues.)
comment:21 follow-up: ↓ 23 Changed at 2014-10-02T01:12:47Z by daira
What is the status of this ticket and https://github.com/tahoe-lafs/pycryptopp/pull/23 ?
comment:22 Changed at 2014-10-02T01:17:47Z by daira
- Owner changed from warner to zooko
comment:23 in reply to: ↑ 21 Changed at 2014-10-03T15:35:09Z by zooko
Replying to daira:
What is the status of this ticket and https://github.com/tahoe-lafs/pycryptopp/pull/23 ?
I'm confused. I apparently thought that the patch had been merged into pycryptopp trunk. Obviously, it hasn't.
comment:24 Changed at 2014-10-17T10:12:57Z by daira
Let's fix this before the Tahoe-LAFS 1.11 1.10.1 release. It's needed for Tahoe-LAFS to work on FreeBSD (https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2269), and it's a clear auditability improvement.
comment:25 Changed at 2014-12-01T16:31:11Z by zooko
- Owner changed from zooko to daira
I agree we should fix this right away. What's the next step?
comment:26 Changed at 2014-12-01T22:53:24Z by daira
I suspect it is to separate out the changes that are relevant to this ticket from https://github.com/tahoe-lafs/pycryptopp/pull/23 , and file another pull request.
comment:27 Changed at 2015-04-14T07:01:37Z by daira
See https://github.com/tahoe-lafs/pycryptopp/pull/26, although it needs more work.
comment:28 Changed at 2016-01-03T22:27:40Z by zooko
We applied the first two patches from https://github.com/tahoe-lafs/pycryptopp/pull/26 (https://github.com/twopir/pycryptopp/commit/b4cb7086cba8ebfdc90fe63b1da879a7ce84c03f and https://github.com/twopir/pycryptopp/commit/cd7a209d10992b87bbbfd70846ee561b6bb218b9), asked Jeromy "whyrusleeping" Johnson to manually verify that this made the build go from failing to working on his Mac OS X 10.10.5 (see https://github.com/tahoe-lafs/pycryptopp/pull/27), and then ran benchmarks before and after the patch on my Google Chromebook Pixel 1 (Intel(R) Core(TM) i5-3427U CPU @ 1.80GHz). The benchmarks (shown below) showed as expected: no change for ECDSA, Ed25519, RSA, or hashlib, and small change in performance for AES, XSalsa20, and SHA256.
benchmarks of current trunk with asm enabled:
zooko@spark ~/playground/pycryptopp $ time python setup.py bench running bench <class 'pycryptopp.bench.bench_sigs.ECDSA256'> generate key best: 4.350e-02, 1th-best: 4.350e-02, mean: 4.350e-02, 1th-worst: 4.350e-02, worst: 4.350e-02 (of 1) sign best: 9.089e-01, 1th-best: 9.089e-01, mean: 9.089e-01, 1th-worst: 9.089e-01, worst: 9.089e-01 (of 1) verify best: 3.479e+00, 1th-best: 3.479e+00, mean: 3.479e+00, 1th-worst: 3.479e+00, worst: 3.479e+00 (of 1) <class 'pycryptopp.bench.bench_sigs.Ed25519'> generate key best: 1.825e+00, 1th-best: 1.825e+00, mean: 1.825e+00, 1th-worst: 1.825e+00, worst: 1.825e+00 (of 1) sign best: 1.830e+00, 1th-best: 1.830e+00, mean: 1.830e+00, 1th-worst: 1.830e+00, worst: 1.830e+00 (of 1) verify best: 6.249e+00, 1th-best: 6.249e+00, mean: 6.249e+00, 1th-worst: 6.249e+00, worst: 6.249e+00 (of 1) <class 'pycryptopp.bench.bench_sigs.RSA2048'> generate key best: 9.093e+01, 1th-best: 9.093e+01, mean: 9.093e+01, 1th-worst: 9.093e+01, worst: 9.093e+01 (of 1) sign best: 1.963e+00, 1th-best: 1.963e+00, mean: 1.963e+00, 1th-worst: 1.963e+00, worst: 1.963e+00 (of 1) verify best: 4.617e-02, 1th-best: 4.617e-02, mean: 4.617e-02, 1th-worst: 4.617e-02, worst: 4.617e-02 (of 1) <class 'pycryptopp.bench.bench_sigs.RSA3248'> generate key best: 5.065e+02, 1th-best: 5.065e+02, mean: 5.065e+02, 1th-worst: 5.065e+02, worst: 5.065e+02 (of 1) sign best: 1.006e+01, 1th-best: 1.006e+01, mean: 1.006e+01, 1th-worst: 1.006e+01, worst: 1.006e+01 (of 1) verify best: 1.068e-01, 1th-best: 1.068e-01, mean: 1.068e-01, 1th-worst: 1.068e-01, worst: 1.068e-01 (of 1) milliseconds per operation <AES-128> large (10000000 B) best: 3.982e+00, 1th-best: 3.982e+00, mean: 3.982e+00, 1th-worst: 3.982e+00, worst: 3.982e+00 (of 1) <AES-256> large (10000000 B) best: 5.422e+00, 1th-best: 5.422e+00, mean: 5.422e+00, 1th-worst: 5.422e+00, worst: 5.422e+00 (of 1) <XSalsa20-256> large (10000000 B) best: 1.387e+00, 1th-best: 1.387e+00, mean: 1.387e+00, 1th-worst: 1.387e+00, worst: 1.387e+00 (of 1) nanoseconds per byte crypted <class 'pycryptopp.bench.bench_hashes.SHA256'> large (10000000 B) best: 5.136e+00, 1th-best: 5.136e+00, mean: 5.136e+00, 1th-worst: 5.136e+00, worst: 5.136e+00 (of 1) <class 'pycryptopp.bench.bench_hashes.hashlibSHA256'> large (10000000 B) best: 4.798e+00, 1th-best: 4.798e+00, mean: 4.798e+00, 1th-worst: 4.798e+00, worst: 4.798e+00 (of 1) nanoseconds per byte hashed
benchmarks with these two patches that disable asm:
running bench <class 'pycryptopp.bench.bench_sigs.ECDSA256'> generate key best: 4.385e-02, 1th-best: 4.385e-02, mean: 4.385e-02, 1th-worst: 4.385e-02, worst: 4.385e-02 (of 1) sign best: 9.948e-01, 1th-best: 9.948e-01, mean: 9.948e-01, 1th-worst: 9.948e-01, worst: 9.948e-01 (of 1) verify best: 3.936e+00, 1th-best: 3.936e+00, mean: 3.936e+00, 1th-worst: 3.936e+00, worst: 3.936e+00 (of 1) <class 'pycryptopp.bench.bench_sigs.Ed25519'> generate key best: 1.818e+00, 1th-best: 1.818e+00, mean: 1.818e+00, 1th-worst: 1.818e+00, worst: 1.818e+00 (of 1) sign best: 1.823e+00, 1th-best: 1.823e+00, mean: 1.823e+00, 1th-worst: 1.823e+00, worst: 1.823e+00 (of 1) verify best: 6.178e+00, 1th-best: 6.178e+00, mean: 6.178e+00, 1th-worst: 6.178e+00, worst: 6.178e+00 (of 1) <class 'pycryptopp.bench.bench_sigs.RSA2048'> generate key best: 6.325e+01, 1th-best: 6.325e+01, mean: 6.325e+01, 1th-worst: 6.325e+01, worst: 6.325e+01 (of 1) sign best: 2.136e+00, 1th-best: 2.136e+00, mean: 2.136e+00, 1th-worst: 2.136e+00, worst: 2.136e+00 (of 1) verify best: 4.965e-02, 1th-best: 4.965e-02, mean: 4.965e-02, 1th-worst: 4.965e-02, worst: 4.965e-02 (of 1) <class 'pycryptopp.bench.bench_sigs.RSA3248'> generate key best: 7.379e+02, 1th-best: 7.379e+02, mean: 7.379e+02, 1th-worst: 7.379e+02, worst: 7.379e+02 (of 1) sign best: 1.139e+01, 1th-best: 1.139e+01, mean: 1.139e+01, 1th-worst: 1.139e+01, worst: 1.139e+01 (of 1) verify best: 1.213e-01, 1th-best: 1.213e-01, mean: 1.213e-01, 1th-worst: 1.213e-01, worst: 1.213e-01 (of 1) milliseconds per operation <AES-128> large (10000000 B) best: 7.006e+00, 1th-best: 7.006e+00, mean: 7.006e+00, 1th-worst: 7.006e+00, worst: 7.006e+00 (of 1) <AES-256> large (10000000 B) best: 9.376e+00, 1th-best: 9.376e+00, mean: 9.376e+00, 1th-worst: 9.376e+00, worst: 9.376e+00 (of 1) <XSalsa20-256> large (10000000 B) best: 5.388e+00, 1th-best: 5.388e+00, mean: 5.388e+00, 1th-worst: 5.388e+00, worst: 5.388e+00 (of 1) nanoseconds per byte crypted <class 'pycryptopp.bench.bench_hashes.SHA256'> large (10000000 B) best: 6.213e+00, 1th-best: 6.213e+00, mean: 6.213e+00, 1th-worst: 6.213e+00, worst: 6.213e+00 (of 1) <class 'pycryptopp.bench.bench_hashes.hashlibSHA256'> large (10000000 B) best: 4.839e+00, 1th-best: 4.839e+00, mean: 4.839e+00, 1th-worst: 4.839e+00, worst: 4.839e+00 (of 1) nanoseconds per byte hashed
comment:29 Changed at 2016-01-03T22:53:05Z by zooko
- Resolution set to fixed
- Status changed from new to closed
comment:30 Changed at 2016-01-14T17:52:11Z by daira
- Milestone set to 0.7.0
Unless Crypto++ speed is a bottleneck of pycryptopp's applications, I see nothing wrong with this.
The main loss here seems to be AES-NI support, which increases AES speed by quite a lot.