[tahoe-dev] [pycryptopp] #2: choose+implement EC-DSA KDF: deterministic generation of private key from small seed
pycryptopp
trac at allmydata.org
Sun Feb 28 15:31:41 PST 2010
#2: choose+implement EC-DSA KDF: deterministic generation of private key from
small seed
------------------------+---------------------------------------------------
Reporter: zooko | Owner: zooko
Type: enhancement | Status: new
Priority: major | Version: 0.4.0
Keywords: | Launchpad_bug:
------------------------+---------------------------------------------------
Comment(by warner):
We chatted on IRC the other week about what the next step should be.
Deciding
upon and implementing the KDF seems like the best one. Testing the KDF is
a
challenge. Here's my current favorite approach.
* the C function (in {{{ecdsamodule.cpp}}}) to construct an ECDSA signing
key should take a longint argument named {{{exponent=}}}, and raise a
{{{ValueError}}} (or maybe {{{TypeError}}}) if its numeric value is
outside the acceptable range (I think this is 1..q-2, where "q" is the
size of the prime subgroup, but I may be wrong).
* the C module should also export a constant, next to the constructor
function, which allows Python code to learn what "q" is.
* pycryptopp must also expose Poly1305 and/or XSalsa20, or whatever hash
function we plan to use for the KDF.
* the Python module should provide an {{{ECDSASigningKey}}} class, whose
constructor takes an argument named seed=, which accepts a variable
length
string.
* internally, this constructor should call a python function named
{{{KDF}}}, which takes the seed and "q", and invokes whatever hash
functions, expansions, modulus computations, and try-try-again loops
as
necessary until it comes up with an exponent in the correct range.
Then
it should call the C-based constructor.
* we might like to say that omitting {{{seed=}}} causes the python code
to
use {{{os.urandom(ceil(log2(q))))}}} as a seed.
This moves the KDF logic out of C and into Python, where we can test it
more
easily.
If we get concerned about speed, we could implement the KDF logic in C as
well, but retain {{{exponent=}}} as an option (callers would either pass
{{{exponent=}}} or {{{seed=}}} but not both), and then have tests which
assert equality of keys generated with the C {{{seed=}}} argument against
keys generated with {{{exponent=python.KDF(seed)}}}, to confirm that the
C-based KDF code behaves the same way as the sample Python implementation.
As to choice of KDF (specifically the "turn N random bits into a suitable
random secret exponent" sort of derivation function), I think the four
basic
variants we discussed (ignoring choice of hash function) were:
* 1 (try-try-again): guess=HASH(seed+counter), truncate to ceil(log2(q))
bits, test against limit, if out of bounds then increment counter and
try
again
* 2 (bigmodulo): expand the seed to perhaps 2*log2(q) bits, reduce it
modulo
the limit, done. (actually it'd be more like e=1+(largesecret%(q-3)) or
something)
* 3 (truncate): expand the seed to floor(log2(q)) bits, done.
* 4 (bigmodulo-and-try-try-again): expand the seed to 2*log2(q) bits, if
that is larger than q*int(guess/q) (i.e. in the "tail"), try again,
else
reduce modulo q and return
The try-try-again choices (1 and 4) have the unfortunate property that,
depending upon how close the group size is to a power of two, they might
be
very hard to test (if q=2**128-9 then the chances of triggering the loop
are
negligible, so writing a test to exercise that code would be hard). The
opposite extreme is occasional cases of looping a large number of times
(if
q=2**128+9 then nearly half of the 129-bit keys would be out-of range,
causing a loop, and although the average number of loops should be two,
the
exponential distribution of loop counts means that sometimes you'll get 10
or
100 or 1000 loops).
The other choices (2 and 3) have the unfortunate property that there will
be
some amount of bias in the resulting key, although at least David-Sarah
believes that the 2-bigmodulo variant can reduce the bias down to a
negligible size (less than a bit). 2-bigmodulo needs an efficient bigint
modulo operation, which is probably not a big deal in the C code (and
certainly not a problem up at the Python level). 3-truncate has up to one
bit
of bias, reducing a 128-bit key to more like a 127-bit key, but is
super-simple and super-fast.
--
Ticket URL: <http://allmydata.org/trac/pycryptopp/ticket/2#comment:13>
pycryptopp <http://allmydata.org/trac/pycryptopp>
Python bindings for the Crypto++ library
More information about the tahoe-dev
mailing list