[tahoe-dev] pycryptopp benchmarks for Marvell Orion 88F5182 ARM9 and Dual Core Cortex A9 1.2 GHz
Johannes Nix
Johannes.Nix at gmx.net
Sun Mar 25 22:36:24 UTC 2012
Hi Zooko,
> If anybody else has a low-power CPU, please run the
> pycryptopp benchmarks and report what you get.
>
here is what I get:
-------------------------------------------------------------------------------
1) for 500 MHz Marvell Orion 88F5182, ARM9-CPU
on DNS-323 NAS, 64 MB RAM, armel architecture
==> This runs quite slowly. Because of the low RAM, I also tried it with
SCHED_FIFO real-time scheduling; the result seems about the same.
running bench
<class 'pycryptopp.bench.bench_sigs.ECDSA256'>
generate key
best: 2.429e+00, 24th-best: 2.445e+00, mean: 2.599e+00, 24th-worst:
2.484e+00, worst: 1.501e+01 (of 92)
sign
best: 4.229e+01, 1th-best: 4.229e+01, mean: 4.551e+01, 1th-worst:
4.872e+01, worst: 4.872e+01 (of 2)
verify
best: 1.011e+02, 1th-best: 1.011e+02, mean: 1.011e+02, 1th-worst:
1.011e+02, worst: 1.011e+02 (of 1)
<class 'pycryptopp.bench.bench_sigs.Ed25519'>
generate key
best: 3.441e+01, 2th-best: 3.441e+01, mean: 3.621e+01, 2th-worst:
4.694e+01, worst: 4.694e+01 (of 7)
sign
best: 3.527e+01, 1th-best: 3.527e+01, mean: 3.652e+01, 1th-worst:
3.777e+01, worst: 3.777e+01 (of 2)
verify
best: 1.200e+02, 1th-best: 1.200e+02, mean: 1.200e+02, 1th-worst:
1.200e+02, worst: 1.200e+02 (of 1)
<class 'pycryptopp.bench.bench_sigs.RSA2048'>
generate key
best: 5.471e+03, 1th-best: 5.471e+03, mean: 5.471e+03, 1th-worst:
5.471e+03, worst: 5.471e+03 (of 1)
sign
best: 2.021e+02, 1th-best: 2.021e+02, mean: 2.021e+02, 1th-worst:
2.021e+02, worst: 2.021e+02 (of 1)
verify
best: 3.870e+00, 1th-best: 3.870e+00, mean: 3.870e+00, 1th-worst:
3.870e+00, worst: 3.870e+00 (of 1)
<class 'pycryptopp.bench.bench_sigs.RSA3248'>
generate key
best: 5.892e+04, 1th-best: 5.892e+04, mean: 5.892e+04, 1th-worst:
5.892e+04, worst: 5.892e+04 (of 1)
sign
best: 9.366e+02, 1th-best: 9.366e+02, mean: 9.366e+02, 1th-worst:
9.366e+02, worst: 9.366e+02 (of 1)
verify
best: 9.484e+00, 1th-best: 9.484e+00, mean: 9.484e+00, 1th-worst:
9.484e+00, worst: 9.484e+00 (of 1)
milliseconds per operation
all results are in time units per N
time units per second: 1000; seconds per time unit: 0.001
<AES-128>
small (1000 B)
best: 4.559e+02, 3th-best: 4.570e+02, mean: 4.929e+02, 3th-worst:
5.431e+02, worst: 5.510e+02 (of 8)
medium (10000 B)
best: 3.414e+02, 1th-best: 3.414e+02, mean: 3.414e+02, 1th-worst:
3.414e+02, worst: 3.414e+02 (of 1)
large (100000 B)
best: 2.802e+02, 1th-best: 2.802e+02, mean: 2.802e+02, 1th-worst:
2.802e+02, worst: 2.802e+02 (of 1)
<AES-256>
small (1000 B)
best: 5.269e+02, 3th-best: 5.350e+02, mean: 5.819e+02, 3th-worst:
6.521e+02, worst: 6.621e+02 (of 8)
medium (10000 B)
best: 3.152e+02, 1th-best: 3.152e+02, mean: 3.152e+02, 1th-worst:
3.152e+02, worst: 3.152e+02 (of 1)
large (100000 B)
best: 4.381e+02, 1th-best: 4.381e+02, mean: 4.381e+02, 1th-worst:
4.381e+02, worst: 4.381e+02 (of 1)
<XSalsa20-256>
small (1000 B)
best: 2.780e+02, 3th-best: 2.780e+02, mean: 2.813e+02, 3th-worst:
2.849e+02, worst: 2.861e+02 (of 8)
medium (10000 B)
best: 1.055e+02, 1th-best: 1.055e+02, mean: 1.055e+02, 1th-worst:
1.055e+02, worst: 1.055e+02 (of 1)
large (100000 B)
best: 8.701e+01, 1th-best: 8.701e+01, mean: 8.701e+01, 1th-worst:
8.701e+01, worst: 8.701e+01 (of 1)
nanoseconds per byte crypted
all results are in time units per N
time units per second: 1000000000; seconds per time unit: 1E-9
<class 'pycryptopp.bench.bench_hashes.SHA256'>
small (1000 B)
best: 2.830e+02, 3th-best: 2.830e+02, mean: 2.857e+02, 3th-worst:
2.859e+02, worst: 2.971e+02 (of 8)
medium (10000 B)
best: 1.571e+02, 1th-best: 1.571e+02, mean: 1.571e+02, 1th-worst:
1.571e+02, worst: 1.571e+02 (of 1)
large (100000 B)
best: 1.375e+02, 1th-best: 1.375e+02, mean: 1.375e+02, 1th-worst:
1.375e+02, worst: 1.375e+02 (of 1)
<class 'pycryptopp.bench.bench_hashes.hashlibSHA256'>
small (1000 B)
best: 3.510e+02, 3th-best: 3.510e+02, mean: 3.886e+02, 3th-worst:
4.110e+02, worst: 5.820e+02 (of 8)
medium (10000 B)
best: 1.800e+02, 1th-best: 1.800e+02, mean: 1.800e+02, 1th-worst:
1.800e+02, worst: 1.800e+02 (of 1)
large (100000 B)
best: 1.628e+02, 1th-best: 1.628e+02, mean: 1.628e+02, 1th-worst:
1.628e+02, worst: 1.628e+02 (of 1)
nanoseconds per byte hashed
all results are in time units per N
time units per second: 1000000000; seconds per time unit: 1E-9
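
As a reading aid (not part of the benchmark output), the "nanoseconds per byte" figures above can be turned into throughput and into the time needed for a small buffer. This is a minimal sketch; the helper names mb_per_second and seconds_for are made up for illustration, and the numbers are the "best" values copied from the runs above.

```python
# Convert "nanoseconds per byte" (as reported above) into throughput and
# into the time needed to process a buffer of a given size.
# The helper names are hypothetical; they are not part of pycryptopp.

def mb_per_second(ns_per_byte):
    """Throughput in MB/s (1 MB = 10**6 bytes) for a given cost per byte."""
    return 1e9 / ns_per_byte / 1e6

def seconds_for(n_bytes, ns_per_byte):
    """Seconds needed to process n_bytes at ns_per_byte."""
    return n_bytes * ns_per_byte * 1e-9

# "best" values of the large (100000 B) runs on the DNS-323, copied from above
large_ns_per_byte = {
    "AES-128": 280.2,
    "AES-256": 438.1,
    "XSalsa20-256": 87.01,
    "SHA256": 137.5,
    "hashlibSHA256": 162.8,
}

for name, ns in sorted(large_ns_per_byte.items()):
    print("%-14s %7.2f ns/B  %6.2f MB/s" % (name, ns, mb_per_second(ns)))

# AES-128, medium (10000 B) run: time to encrypt 3000 bytes
print("3000 B at 341.4 ns/B: %.3f ms" % (seconds_for(3000, 341.4) * 1e3))
```

Read this way, 3000 bytes at roughly 340 ns per byte take about one millisecond on this box, not one second.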
-------------------------------------------------------------------------------
2) Dual-core ARM® Cortex-A9 with 1.2 GHz clock on
PandaBoard ES = OMAP4 SoC, 1 GB RAM, armel / armv7 architecture,
Texas Instruments Kernel 3.1.0-1282-omap4 for Ubuntu.
With pyutil, I get this stack trace:
running bench
<class 'pycryptopp.bench.bench_sigs.ECDSA256'>
generate key
Traceback (most recent call last):
  File "setup.py", line 453, in <module>
    cmdclass=commands,
  File "/usr/lib/python2.7/distutils/core.py", line 152, in setup
    dist.run_commands()
  File "/usr/lib/python2.7/distutils/dist.py", line 953, in run_commands
    self.run_command(cmd)
  File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
    cmd_obj.run()
  File "setup.py", line 422, in run
    bench_algs.bench(MAXTIME=1.0)
  File "build/lib.linux-armv7l-2.7/pycryptopp/bench/bench_algs.py", line 4, in bench
    bench_sigs.bench(MAXTIME)
  File "build/lib.linux-armv7l-2.7/pycryptopp/bench/bench_sigs.py", line 138, in bench
    bench_sigs(MAXTIME)
  File "build/lib.linux-armv7l-2.7/pycryptopp/bench/bench_sigs.py", line 126, in bench_sigs
    rep_bench(ob.gen, 4, UNITS_PER_SECOND=1000, MAXTIME=MAXTIME, MAXREPS=100)
  File "/usr/local/lib/python2.7/dist-packages/pyutil-1.8.7-py2.7.egg/pyutil/benchutil.py", line 119, in rep_bench
    tl = bench_it(func, n, profile=profile, profresults=profresults)
  File "/usr/local/lib/python2.7/dist-packages/pyutil-1.8.7-py2.7.egg/pyutil/benchutil.py", line 188, in bench_it
    _assert(timeelapsed*MARGINOFERROR > worstemptymeasure, "Invoking func %s(%s) took only %0.20f seconds, but we cannot accurately measure times much less than %0.20f seconds. Therefore we cannot produce good results here. Try benchmarking a more time-consuming variant." % (func, n, timeelapsed, worstemptymeasure,))
  File "/usr/local/lib/python2.7/dist-packages/pyutil-1.8.7-py2.7.egg/pyutil/assertutil.py", line 24, in _assert
    raise AssertionError, "".join(msgbuf)
AssertionError: 'Invoking func <bound method ECDSA256.gen of <pycryptopp.bench.bench_sigs.ECDSA256 object at 0x1952590>>(4) took only 0.00006079673767089844 seconds...imes much less than 0.00118994712829589844 seconds. Therefore we cannot produce good results here. Try benchmarking a more time-consuming variant.' <type 'str'>
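
To make the failed check easier to read, here is a minimal sketch (in Python 3, not pyutil's actual code) that plugs the two numbers from the AssertionError into the condition shown in the trace. The value of MARGINOFERROR does not appear in the trace, so the 10 used here is only a placeholder, and time_with_growing_n is a hypothetical helper showing one way a harness could react by making the timed call longer instead of giving up.

```python
import time

# Values copied from the AssertionError above.
timeelapsed = 0.00006079673767089844        # seconds for one call of gen(4)
worstemptymeasure = 0.00118994712829589844  # seconds; judging by the name and
                                            # message, the timing noise floor

# ASSUMPTION: the real value of pyutil's MARGINOFERROR is not shown in the
# trace; 10 is a placeholder for illustration only.
MARGINOFERROR = 10

print("elapsed:     %.1f microseconds" % (timeelapsed * 1e6))
print("noise floor: %.1f microseconds" % (worstemptymeasure * 1e6))
print("check passes: %s" % (timeelapsed * MARGINOFERROR > worstemptymeasure))

def time_with_growing_n(func, floor, margin=MARGINOFERROR, n=1, max_n=1 << 20):
    """Hypothetical helper: double n until func(n) runs long enough to stand
    out from the noise floor, then return (n, seconds taken by func(n))."""
    while n <= max_n:
        start = time.perf_counter()
        func(n)
        elapsed = time.perf_counter() - start
        if elapsed * margin > floor:
            return n, elapsed
        n *= 2
    raise RuntimeError("func is too fast to measure even with n=%d" % max_n)
```

With the placeholder margin, the elapsed time of about 61 microseconds is well below the roughly 1190 microsecond floor, which is why the benchmark refuses to report a result.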
Without pyutil in PYTHONPATH, I get:
-------------------------------------------------------------------------------
running bench
<class 'pycryptopp.bench.bench_sigs.ECDSA256'>
generate key
mean: 2.959e-03 (of 100)
sign
mean: 1.172e+00 (of 7)
verify
mean: 2.274e+01 (of 1)
<class 'pycryptopp.bench.bench_sigs.Ed25519'>
generate key
mean: 2.748e-01 (of 30)
sign
mean: 1.424e+00 (of 6)
verify
mean: 2.859e+01 (of 1)
<class 'pycryptopp.bench.bench_sigs.RSA2048'>
generate key
mean: 1.973e+03 (of 1)
sign
mean: 5.337e+01 (of 1)
verify
mean: 6.137e-02 (of 13)
<class 'pycryptopp.bench.bench_sigs.RSA3248'>
generate key
mean: 1.220e+04 (of 1)
sign
mean: 2.725e+02 (of 1)
verify
mean: 4.839e-01 (of 5)
milliseconds per operation
time units per second: 1000; seconds per time unit: 0.001
<AES-128>
small (1000 B)
mean: 3.447e-01 (of 100)
medium (10000 B)
mean: 4.355e-01 (of 100)
large (100000 B)
mean: 4.234e-01 (of 100)
<AES-256>
small (1000 B)
mean: 7.079e-01 (of 100)
medium (10000 B)
mean: 5.539e-01 (of 100)
large (100000 B)
mean: 5.800e-01 (of 100)
<XSalsa20-256>
small (1000 B)
mean: 1.709e-01 (of 100)
medium (10000 B)
mean: 1.578e-01 (of 100)
large (100000 B)
mean: 1.798e-01 (of 100)
nanoseconds per byte crypted
time units per second: 1000000000; seconds per time unit: 1E-9
<class 'pycryptopp.bench.bench_hashes.SHA256'>
small (1000 B)
mean: 3.478e-01 (of 100)
medium (10000 B)
mean: 2.545e-01 (of 100)
large (100000 B)
mean: 2.932e-01 (of 100)
<class 'pycryptopp.bench.bench_hashes.hashlibSHA256'>
small (1000 B)
mean: 6.409e-01 (of 100)
medium (10000 B)
mean: 4.181e-01 (of 100)
large (100000 B)
mean: 3.891e-01 (of 100)
nanoseconds per byte hashed
time units per second: 1000000000; seconds per time unit: 1E-9
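
For a rough side-by-side of the two boards, the sketch below divides the DNS-323 figures by the PandaBoard ES figures for the RSA rows, using the "mean" values copied from the two runs above. The helper name speedups is made up for illustration. Most of these rows are single measurements ("of 1"), so the ratios are only indicative. Rows where the two runs differ by several orders of magnitude (for example ECDSA256 key generation, 2.599e+00 vs 2.959e-03) are left out, since the two runs apparently went through different timing code (with and without pyutil) and may not be directly comparable.

```python
# Milliseconds per operation ("mean" values), copied from the two runs above.
dns323 = {
    ("RSA2048", "generate key"): 5.471e+03,
    ("RSA2048", "sign"): 2.021e+02,
    ("RSA3248", "generate key"): 5.892e+04,
    ("RSA3248", "sign"): 9.366e+02,
}
panda = {
    ("RSA2048", "generate key"): 1.973e+03,
    ("RSA2048", "sign"): 5.337e+01,
    ("RSA3248", "generate key"): 1.220e+04,
    ("RSA3248", "sign"): 2.725e+02,
}

def speedups(slow, fast):
    """Return {key: slow/fast} for every key present in both mappings."""
    return {k: slow[k] / fast[k] for k in slow if k in fast}

for (alg, op), ratio in sorted(speedups(dns323, panda).items()):
    print("%-8s %-13s %5.1fx" % (alg, op, ratio))
```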
-------------------------------------------------------------------------------
Is the time of only about 61 µs (0.0000608 seconds) given in the stack trace realistic?
Regards,
Johannes
On Wed, 21 Mar 2012 09:38:45 -0600
"Zooko Wilcox-O'Hearn" <zooko at zooko.com> wrote:
> Folks:
>
> Someone asked me in private email if pycryptopp included ECDSA. The
> answer is that it does, but it was never really finished, never used,
> and is deprecated.
>
> However, I took the opportunity to do some simple benchmarks of
> pycryptopp's unused ECDSA-192 vs. its new Ed25519 and its current
> RSA-2048. I added a benchmarking script and configured buildbot to run
> it automatically. Then I amended it
> [e6833d434fbad94d7fff753ea6293e99d67a3f7c] to do 256-bit ECDSA, which
> is comparably strong to Ed25519, and 3248-bit RSA, which is comparably
> strong to Ed25519 (according to ECRYPT II --
> http://www.keylength.com/en/3/ ). Also I added benchmarks for the
> ciphers (AES-128, AES-256, and XSalsa20) and SHA-256 and the Python
> Standard Library's "hashlib" implementation of SHA-256.
>
> You can browse the buildbot
> (https://tahoe-lafs.org/buildbot-pycryptopp/waterfall) for the
> results, but here is my summary.
>
> 1. The important thing is that the RSA-2048 that we're currently using
> for digital signatures takes a lot of CPU to generate a new keypair,
> which we do on every mkdir and every creation of a mutable file.
> (Every mutable directory and every mutable file comes with its own
> keypair, so that you can grant someone write authority to that one by
> giving them the private key without thereby granting them write
> authority to any other file or directory.)
>
> For example, here's the results on the Atlas buildslave (donated by
> Atlas Networks and operated by Brian Warner -- thank you both!)
>
> https://tahoe-lafs.org/buildbot-pycryptopp/builders/atlas1%20natty/builds/96/steps/bench/logs/stdio
>
> It took 394 milliseconds to generate a keypair. Here is the
> measurement of the same operation on other machines:
>
> https://tahoe-lafs.org/buildbot-pycryptopp/builders/marcusw%20cygwin/builds/85/steps/bench/logs/stdio
>
> 453 milliseconds
>
> https://tahoe-lafs.org/buildbot-pycryptopp/builders/Kyle%20OpenBSD-amd64/builds/99/steps/bench/logs/stdio
>
> 114 milliseconds
>
> https://tahoe-lafs.org/buildbot-pycryptopp/builders/francois-ts109-armv5tel%20syslib/builds/95/steps/bench/logs/stdio
>
> 5879 milliseconds!
>
> (That's François's little ARM box.)
>
> 2. RSA-2048 is not strong enough for long-term security. For our
> target security level of 2¹²⁸, we would require RSA-3248. That is even
> more ridiculously expensive to generate a key. On the Atlas buildslave
> it takes 6000 milliseconds, and on the little ARM box it takes 50,000
> milliseconds!
>
> 3. Ed25519 is much faster at generating keys -- Atlas server: 0.02
> milliseconds, ARM box: 32 milliseconds. By inventing a new mutable
> file format which uses Ed25519 (which is ticket #217), we can
> noticeably reduce the delay in creating a new mutable file or
> directory. It will also simplify the file format since Ed25519 private
> keys and public keys are merely 32 bytes each. This is small
> enough that we can stick the whole private key into a write cap and
> the whole public key into a read cap. This will be simpler than the
> current scheme in which we tie RSA public keys—which are hundreds of
> bytes—into the read caps by sticking the hash of the public key into
> the read cap, encrypting the public key with said hash, and storing
> the ciphertext of the public key to the storage server.
>
> 4. Our strategy has always been to choose algorithms that *can* be
> implemented with great efficiency in the long run, even if the current
> implementation that we are using is merely adequate. This is most
> clearly shown by our choice of Ed25519 and the current "reference
> implementation" thereof. The current reference implementation is a tad
> slower than the current ECDSA implementation, but still adequate for
> current needs. There exist more aggressively optimized implementations
> of Ed25519 which are up to 100X faster! So if we ever need a more
> efficient digital signature, we can get it by upgrading to a more
> optimized implementation without losing compatibility with extant
> versions of Tahoe-LAFS.
>
> 5. Unless there's something wrong with my benchmarks, encryption isn't
> quite as fast as I'd like on François's little low-power ARM box. 340
> μs per byte? That's almost 1 ms for 3 bytes! That means it would take a
> whole second to en-/de-crypt 3000 bytes? If it were really that slow,
> I would think that François's little ARM box would be uncomfortable to
> actually use. However, I remember using it (he made it publicly
> available as a gateway on the Test Grid), so it wasn't totally
> unusable. If anybody else has a low-power CPU, please run the
> pycryptopp benchmarks and report what you get.
>
> Regards,
>
> Zooko
>
> patches mentioned in this letter:
>
> https://tahoe-lafs.org/trac/pycryptopp/changeset/e6833d434fbad94d7fff753ea6293e99d67a3f7c/git
>
> tickets mentioned in this letter:
>
> https://tahoe-lafs.org/trac/tahoe-lafs/ticket/217# Ed25519-based
> mutable files -- fast file creation, possibly smaller URLs
> _______________________________________________
> tahoe-dev mailing list
> tahoe-dev at tahoe-lafs.org
> http://tahoe-lafs.org/cgi-bin/mailman/listinfo/tahoe-dev