[tahoe-dev] note about hash-based digital signatures
Brian Warner
warner at lothar.com
Wed Jun 23 12:01:11 PDT 2010
On 6/23/10 7:18 AM, David-Sarah Hopwood wrote:
> David-Sarah Hopwood wrote:
>
> Ah, but it will work for a multi-layer Merkle tree scheme, such as
> GMSS: if keys are generated deterministically from a seed, then the
> signatures certifying keys at upper layers are also deterministic, so
> there's no key-reuse problem for those.
Right, exactly. The basic scheme would look something like this:
* set K=1024
* build a K-ary tree with 2^M=2^256 leaves. Each leaf corresponds to a
possible message.
* define a deterministic keyed function from leaf number to keypair
* define a deterministic keyed function from node number to keypair
* publish the root pubkey as the readcap. Retain the secret key (used
as an input to the above deterministic functions) as the writecap.
* to sign a given message:
* identify the DEPTH=ln(base=K, 2^256) =26ish tree nodes along the
path from root to leaf
* generate the one privkey for the message leaf, use it to sign the
message itself
* for all nodes on the path, from bottom to top:
[
* generate all K pubkeys for the node's children, concatenate them
into a string, treat that as a message
* sign the message with the parent node's privkey
]
As zooko pointed out, the leaf signature can be optimized away: since
each message gets a unique pubkey, it suffices to reveal the
preimage/privkey for that pubkey. This reduces the size of the leaf
signature down to a single hash.
Assuming a 256-bit Merkle-signature scheme (with a "0" and a "1"
preimage for each bit), it takes 512 hashes to build all the
privkey/preimages, and then 512 hashes to build the pubkeys. Each
signature requires computing and publishing (revealing) 256 preimage
hashes.
Generating the readcap is pretty easy: you build the K pubkeys for the
top node (4*256 small hashes each), hash each one down (K large hashes
of 512*32 bytes each), then hash those lists into a single value (one
large hash of K*32 bytes). So 1M small hashes, 1024 hashes of 16KiB
each, and one hash of 32KiB bytes.
For K=1024 and M=256, the message signature consists of the leaf
signature (one hash), and 26 nodes. Each node contains one signature
(256 preimages), one pubkey (512 hashes), and 1023 hashes-of-pubkeys. So
the overall message signature requires you publish about 46567 hashes,
or 1.5MB.
To compute that signature, you have to actually generate all those
pubkeys (even though you only have to publish the hashes of most of
them). You must generate DEPTH*K (about 26624) pubkeys. Each one
requires generating 512 preimages (one hash each, H(secret+nodenum+bit))
and then hashing each one to get the post-image, which is 1024 hashes
per pubkey, or about 27M hashes (these are all tiny hashes: 256 bytes
input). You have to hash each pubkey down (DEPTH*K large hashes of
512*32 bytes each), and then for each node, hash those lists into a
single value for the parent to sign (DEPTH large hashes of 1024*32
bytes).
So starting with:
K: branching factor, e.g. 1024
M: size of message in bits, num-leaves=2^M, e.g. 256
DEPTH: depth of tree, ln(K,2^M)
H: size of a hash in bytes, e.g. 32
building the readcap requires computation:
K*4*M tiny hashes (of H bytes each)
K large hashes (of 2*M*H bytes each)
1 large hash (of K*H bytes)
signing a message requires computation:
DEPTH*K*4*M tiny hashes (of H bytes each)
DEPTH larger hashes (of K*H bytes each)
signing a message requires publication:
DEPTH*(3*M+(K-1))*H bytes of hashes
Here's the optimization table: the smallest publish size occurs at K=256
(1MB of data, 8.4M hashes), while the fastest computation (for both the
readcap generation and per-write signature generation) occurs at K=2
(523k hashes), but K=4 is almost as good and drops the publish size to
3.16MB of data. Generation script is attached; please check my math!
% python /tmp/hashsigs.py
k=branch readcap | signature
factor small pubkey | small large large publish
hashes hashes | hashes hashes hashsize size
======== ====== ====== ======== ====== ======== =======
2 2.05k 2 524.29k 256 64 6.30MB
4 4.10k 4 524.29k 128 128 3.16MB <--
8 8.19k 8 704.51k 86 256 2.13MB
16 16.38k 16 1.05M 64 512 1.60MB
32 32.77k 32 1.70M 52 1024 1.33MB
64 65.54k 64 2.82M 43 2048 1.14MB
128 131.07k 128 4.85M 37 4096 1.06MB
256 262.14k 256 8.39M 32 8192 1.05MB <--
512 524.29k 512 15.20M 29 16384 1.19MB
1024 1.05M 1.02k 27.26M 26 32768 1.49MB
2048 2.10M 2.05k 50.33M 24 65536 2.16MB
4096 4.19M 4.10k 92.27M 22 131072 3.42MB
8192 8.39M 8.19k 167.77M 20 262144 5.73MB
16384 16.78M 16.38k 318.77M 19 524288 10.43MB
32768 33.55M 32.77k 603.98M 18 1048576 19.32MB
65536 67.11M 65.54k 1.07G 16 2097152 33.95MB
131072 134.22M 131.07k 2.15G 16 4194304 67.50MB
262144 268.44M 262.14k 4.03G 15 8388608 126.20MB
524288 536.87M 524.29k 7.52G 14 16777216 235.22MB
(this doesn't take account the leaf-signature-optimization, but that
only saves you a few hundred hashes).
cheers,
-Brian
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: hashsigs.py
URL: <http://allmydata.org/pipermail/tahoe-dev/attachments/20100623/b1c4d075/attachment.ksh>
More information about the tahoe-dev
mailing list