[tahoe-lafs-trac-stream] [tahoe-lafs] #1924: NetBSD < 6.0 /dev/random appears to break RSA keygen in test suites

tahoe-lafs trac at tahoe-lafs.org
Mon Mar 11 04:57:26 UTC 2013


#1924: NetBSD < 6.0 /dev/random appears to break RSA keygen in test suites
-------------------------------+------------------------------------
     Reporter:  midnightmagic  |      Owner:
         Type:  defect         |     Status:  new
     Priority:  major          |  Milestone:  undecided
    Component:  code           |    Version:  1.9.2
   Resolution:                 |   Keywords:  netbsd random cryptopp
Launchpad Bug:                 |
-------------------------------+------------------------------------

Comment (by zooko):

 So the way this bug manifests is that Crypto++ detects an internal
 inconsistency and throws an exception saying
 {{{"InvertibleRSAFunction: computational error during private key operation"}}}.
 The only place in Crypto++ which emits that exception message is
 [//trac/pycryptopp/browser/git/src-cryptopp/rsa.cpp?annotate=blame&rev=9c884d4ea2c75bc47dc49d4c404bfc5a9fc3b437#L223].

 This is a self-test which Crypto++ always does internally, checking that,
 if it has computed {{{y}}} to be the "RSA inverse" of {{{x}}} {{{mod N}}},
 then {{{yᵉ = x mod N}}}. This internal consistency check fails
 frequently on midnightmagic's NetBSD 5 machine when the "entropy level" of
 /dev/urandom is drained.
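
 The check described above can be sketched in a few lines of Python. This
 is only an illustration: the toy parameters (p = 61, q = 53, e = 17,
 d = 2753) and the function name are assumptions chosen for readability,
 not values or names taken from Crypto++ or from this ticket.

```python
# Toy RSA parameters (assumed for illustration; far too small for real use).
P, Q = 61, 53
N = P * Q            # modulus, 3233
E = 17               # public exponent
D = 2753             # private exponent: (E * D) % ((P - 1) * (Q - 1)) == 1


def rsa_inverse_checked(x):
    """Compute y = x^D mod N, then verify that y^E mod N gives x back."""
    y = pow(x, D, N)
    if pow(y, E, N) != x % N:
        # Mirrors the wording of the exception discussed in this comment.
        raise ArithmeticError("computational error during private key operation")
    return y
```

 With correct arithmetic and unchanged inputs the check can never fail,
 which is why a failure points at a bug somewhere below the math.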

 Now, the thing about this is that the blocking or non-blocking or delaying
 behavior of /dev/urandom //cannot// be a legitimate excuse for this
 internal check to fail! There has to be a bug, either in Crypto++, in the
 compiler, or in the kernel, in order to let this inconsistency happen.

 Here's a typical example of the error, on midnightmagic's NetBSD 5
 buildslave:

 https://tahoe-lafs.org/buildbot-pycryptopp/builders/MM%20netbsd5%20i386%20warp/builds/146/steps/bench/logs/stdio

 In case that is no longer available, here is a copy of it for posterity:

 {{{
 python setup.py bench
  in dir /home/pycryptopp/buildslave/pycryptopp/MM_netbsd5_i386_warp/build
 (timeout 14400 secs)
  watching logfiles {}
  argv: ['python', 'setup.py', 'bench']
  environment:
   EDITOR=joe
   ENV=/home/pycryptopp/.shrc
   EXINIT=set autoindent
   HISTFILESIZE=100000
   HISTSIZE=100000
   HOME=/home/pycryptopp
   LESS=-X
   LOGNAME=pycryptopp
   OLDPWD=/home/pycryptopp
   PAGER=more
   PATH=/home/pycryptopp/bin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/X11R7/bin:/usr/X11R6/bin:/usr/pkg/bin:/usr/pkg/sbin:/usr/games:/usr/local/bin:/usr/local/sbin
   PWD=/home/pycryptopp/buildslave/pycryptopp/MM_netbsd5_i386_warp/build
   PYTHONPATH=/home/pycryptopp/lib/python2.6/site-packages
   SHELL=/usr/pkg/bin/bash
   SHLVL=1
   SU_FROM=root
   TERM=xterm
   USER=pycryptopp
   _=/home/pycryptopp/bin/buildslave
  using PTY: False
 running bench
 terminate called after throwing an instance of 'CryptoPP::Exception'
   what():  InvertibleRSAFunction: computational error during private key operation
 <class 'pycryptopp.bench.bench_sigs.ECDSA256'>
 generate key
 best: 1.194e-01,   3th-best: 1.195e-01, mean: 1.196e-01,   3th-worst: 1.196e-01, worst: 1.198e-01 (of      9)
 sign
 best: 3.679e+00,   1th-best: 3.679e+00, mean: 3.679e+00,   1th-worst: 3.679e+00, worst: 3.679e+00 (of      1)
 verify
 best: 1.282e+01,   1th-best: 1.282e+01, mean: 1.282e+01,   1th-worst: 1.282e+01, worst: 1.282e+01 (of      1)

 <class 'pycryptopp.bench.bench_sigs.Ed25519'>
 generate key
 best: 3.968e+00,   1th-best: 3.968e+00, mean: 3.968e+00,   1th-worst: 3.968e+00, worst: 3.968e+00 (of      1)
 sign
 best: 4.589e+00,   1th-best: 4.589e+00, mean: 4.589e+00,   1th-worst: 4.589e+00, worst: 4.589e+00 (of      1)
 verify
 best: 1.221e+01,   1th-best: 1.221e+01, mean: 1.221e+01,   1th-worst: 1.221e+01, worst: 1.221e+01 (of      1)

 <class 'pycryptopp.bench.bench_sigs.RSA2048'>
 generate key
 best: 5.225e+02,   1th-best: 5.225e+02, mean: 5.537e+02,   1th-worst: 5.848e+02, worst: 5.848e+02 (of      2)
 sign
 process killed by signal 6
 program finished with exit code -1
 elapsedTime=45.001851
 }}}

 (By the way, the fact that Crypto++ does this sort of internal self-check
 unconditionally (i.e., not only when built in some sort of "debug mode") is
 an example of the kind of careful cryptographic engineering which I
 appreciate about Crypto++.)

 Midnightmagic upgraded various components of his computer until he had
 eventually replaced every single component of the computer and the error
 behavior never changed, so it can't be an actual hardware error. (Also, it
 would be a suspiciously specific sort of behavior for a hardware error.)

 I wrote this patch which removes use of the operating system's random
 number generator and instead hardcodes a seed so that the RNG generates
 the same sequence each time:

 https://github.com/zooko/pycryptopp/commits/debug-netbsd-rsa

 Midnightmagic ran with that patch intensively for many hours, and it never
 showed any failure.

 Then he figured out that it happened a lot more frequently when the
 "entropy pool" was running low.

 Now Samuel "Dcoder" Neves and I poked through the relevant parts of the
 Crypto++ source code, and we didn't see any bug in there that could lead
 to this.

 Hm...

 You know what? Something midnightmagic mentioned about timing made me
 realize that there //is// a way that a timing race could cause exactly
 this observed failure. That is, see how in
 [//trac/pycryptopp/browser/git/src-cryptopp/rsa.cpp?annotate=blame&rev=9c884d4ea2c75bc47dc49d4c404bfc5a9fc3b437#L229 line 229]
 it sets {{{r}}} equal to a random number read from the operating
 system? And then on line 230 it sets {{{rInv}}} to the multiplicative
 inverse of {{{r}}}? And then later on line 233 it uses {{{r}}} for
 something else. Now, if there is a bug in the kernel such that it
 overwrote the contents of {{{r}}}'s memory //after// line 229 — after the
 call to {{{Randomize()}}} returned — then that would cause this bug. So,
 look for a race condition/insufficient-locking in the NetBSD kernel such
 that reading from /dev/random causes your memory to get written to by the
 kernel after your {{{read()}}} has returned.
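
 To make the hypothesis concrete, here is a simplified Python sketch of a
 computation shaped like the one described above: an inverse is derived
 from {{{r}}}, and {{{r}}} is used again afterwards. It is an illustration
 under assumptions, not a transcription of rsa.cpp. The toy parameters,
 the function name, and the {{{overwrite_r_with}}} argument (which
 simulates the hypothesized late overwrite) are all hypothetical. It needs
 Python 3.8 or later for {{{pow(r, -1, N)}}}.

```python
# Toy RSA parameters (assumed for illustration; far too small for real use).
N = 61 * 53          # modulus, 3233
E = 17               # public exponent
D = 2753             # private exponent matching E for this modulus


def blinded_inverse(x, r, overwrite_r_with=None):
    """RSA inverse of x using a blinding value r, with the final self-check.

    If overwrite_r_with is given, r is replaced *after* its inverse has been
    computed, simulating memory that changes after the random read returned.
    """
    r_inv = pow(r, -1, N)                 # inverse derived from the original r
    if overwrite_r_with is not None:
        r = overwrite_r_with              # simulated late overwrite of r
    blinded = (pow(r, E, N) * x) % N      # r is used again here
    y = (pow(blinded, D, N) * r_inv) % N  # undo the blinding with r_inv
    if pow(y, E, N) != x % N:             # the internal consistency check
        raise ArithmeticError("computational error during private key operation")
    return y
```

 With an unchanged {{{r}}} the result equals the plain RSA inverse. If
 {{{r}}} changes between the two uses, {{{r_inv}}} no longer cancels it
 and the final check raises, which matches the observed symptom.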

 To help find such a bug, please try this patch:

 https://github.com/zooko/pycryptopp/commits/debug-netbsd-rsa-2

 This uses the standard (operating-system-provided) RNG, but does extra
 self-checks in search of the hypothesized "late memory overwrite" that I
 speculate about above.
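
 As a rough idea of what a self-check for a late overwrite can look like,
 here is a hedged Python sketch. It does not reproduce the patch linked
 above; the function name and its parameters are hypothetical. It reads
 random bytes into a buffer, takes a snapshot right after the read
 returns, runs some work, and then confirms the buffer still matches the
 snapshot.

```python
import io


def read_and_recheck(stream, nbytes, work):
    """Read nbytes from a binary stream, run work(buf), and verify that the
    buffer is still identical to what the read originally returned."""
    buf = bytearray(nbytes)
    got = stream.readinto(buf)
    if got != nbytes:
        raise IOError("short read: wanted %d bytes, got %r" % (nbytes, got))
    snapshot = bytes(buf)        # copy taken immediately after the read
    result = work(buf)           # the computation that consumes the bytes
    if bytes(buf) != snapshot:   # did the buffer change behind our back?
        raise RuntimeError("buffer changed after the read returned")
    return result


# Example use against an in-memory stream; on a real system one could pass
# open("/dev/random", "rb", buffering=0) instead.
value = read_and_recheck(io.BytesIO(b"\x01\x02\x03\x04"), 4,
                         lambda b: int.from_bytes(b, "big"))
```

 A snapshot comparison like this can only catch a change that happens
 between the snapshot and the recheck, so it narrows down a race rather
 than proving its absence.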

-- 
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1924#comment:7>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage

