[tahoe-dev] [tahoe-lafs] #329: dirnodes could cache encrypted/serialized entries for speed
tahoe-lafs
trac at allmydata.org
Tue Jul 7 07:38:33 PDT 2009
#329: dirnodes could cache encrypted/serialized entries for speed
---------------------------+------------------------------------------------
Reporter: warner | Owner: kevan
Type: enhancement | Status: new
Priority: minor | Milestone: undecided
Component: code-dirnodes | Version: 0.8.0
Keywords: dirnode | Launchpad_bug:
---------------------------+------------------------------------------------
Comment(by zooko):
Summarizing into {{{A+Bx}}} is not the way I do it. Instead I look at a
handful of data points, like this:
{{{
benchmarking <function unpack at 0x32c0970>
N: 64, time: best: 0.08, 2th-best: 0.08, ave: 0.09, 2th-
worst: 0.09, worst: 0.10 (of 5), reps/s: 11, ave rate:
732
N: 256, time: best: 0.35, 2th-best: 0.35, ave: 0.36, 2th-
worst: 0.35, worst: 0.39 (of 5), reps/s: 2, ave rate:
714
N: 1024, time: best: 1.51, 2th-best: 1.51, ave: 1.55, 2th-
worst: 1.51, worst: 1.72 (of 5), reps/s: 0, ave rate:
659
N: 4096, time: best: 10.48, 2th-best: 10.49, ave: 10.66, 2th-
worst: 10.53, worst: 11.29 (of 5), reps/s: 0, ave rate:
384
benchmarking <function pack at 0x32c0930>
N: 64, time: best: 0.08, 2th-best: 0.08, ave: 0.09, 2th-
worst: 0.09, worst: 0.09 (of 5), reps/s: 11, ave rate:
747
N: 256, time: best: 0.35, 2th-best: 0.35, ave: 0.35, 2th-
worst: 0.35, worst: 0.35 (of 5), reps/s: 2, ave rate:
728
N: 1024, time: best: 1.52, 2th-best: 1.53, ave: 1.53, 2th-
worst: 1.53, worst: 1.53 (of 5), reps/s: 0, ave rate:
670
N: 4096, time: best: 10.48, 2th-best: 10.49, ave: 10.50, 2th-
worst: 10.52, worst: 10.53 (of 5), reps/s: 0, ave rate:
390
benchmarking <function unpack_and_repack at 0x32c09b0>
N: 64, time: best: 0.08, 2th-best: 0.09, ave: 0.09, 2th-
worst: 0.09, worst: 0.09 (of 5), reps/s: 11, ave rate:
744
N: 256, time: best: 0.35, 2th-best: 0.35, ave: 0.35, 2th-
worst: 0.35, worst: 0.35 (of 5), reps/s: 2, ave rate:
727
N: 1024, time: best: 1.52, 2th-best: 1.52, ave: 1.54, 2th-
worst: 1.53, worst: 1.60 (of 5), reps/s: 0, ave rate:
665
N: 4096, time: best: 10.49, 2th-best: 10.50, ave: 10.51, 2th-
worst: 10.51, worst: 10.52 (of 5), reps/s: 0, ave rate:
390
}}}
This is a good example of why I do it this way: because it isn't linear!
So any {{{A+Bx}}} summary would be off. I already have a patch which
fixes the observable non-linearity there. I guess I might as well commit
that one so that everyone can see it. There -- [3969], entitled
"directories: keep track of your position as you decode netstring after
netstring from an input buffer instead of copying the trailing part | This
makes decoding linear in the number of netstrings instead of O(N^2^).".
After this patch, the same benchmark on the same machine says:
{{{
benchmarking <function unpack at 0x33429b0>
N: 64, time: best: 0.08, 2th-best: 0.08, ave: 0.09, 2th-
worst: 0.08, worst: 0.10 (of 5), reps/s: 11, ave rate:
750
N: 256, time: best: 0.33, 2th-best: 0.33, ave: 0.34, 2th-
worst: 0.33, worst: 0.37 (of 5), reps/s: 2, ave rate:
755
N: 1024, time: best: 1.28, 2th-best: 1.28, ave: 1.32, 2th-
worst: 1.28, worst: 1.48 (of 5), reps/s: 0, ave rate:
776
N: 4096, time: best: 4.83, 2th-best: 4.83, ave: 4.97, 2th-
worst: 4.83, worst: 5.51 (of 5), reps/s: 0, ave rate:
824
benchmarking <function pack at 0x3342970>
N: 64, time: best: 0.08, 2th-best: 0.08, ave: 0.08, 2th-
worst: 0.08, worst: 0.09 (of 5), reps/s: 11, ave rate:
760
N: 256, time: best: 0.33, 2th-best: 0.33, ave: 0.33, 2th-
worst: 0.33, worst: 0.33 (of 5), reps/s: 2, ave rate:
767
N: 1024, time: best: 1.29, 2th-best: 1.29, ave: 1.29, 2th-
worst: 1.29, worst: 1.29 (of 5), reps/s: 0, ave rate:
794
N: 4096, time: best: 4.83, 2th-best: 4.83, ave: 4.83, 2th-
worst: 4.83, worst: 4.84 (of 5), reps/s: 0, ave rate:
847
benchmarking <function unpack_and_repack at 0x33429f0>
N: 64, time: best: 0.08, 2th-best: 0.08, ave: 0.08, 2th-
worst: 0.08, worst: 0.09 (of 5), reps/s: 11, ave rate:
759
N: 256, time: best: 0.33, 2th-best: 0.33, ave: 0.33, 2th-
worst: 0.33, worst: 0.33 (of 5), reps/s: 2, ave rate:
767
N: 1024, time: best: 1.29, 2th-best: 1.29, ave: 1.30, 2th-
worst: 1.29, worst: 1.36 (of 5), reps/s: 0, ave rate:
786
N: 4096, time: best: 4.82, 2th-best: 4.83, ave: 4.85, 2th-
worst: 4.87, worst: 4.90 (of 5), reps/s: 0, ave rate:
844
}}}
As to your question about what's the expensive part, after the
optimization patch from Kevan plus a few that I have here in my sandbox,
benchmarking shows:
{{{
benchmarking <function unpack at 0x33003f0>
N: 64, time: best: 0.07, 2th-best: 0.07, ave: 0.08, 2th-
worst: 0.07, worst: 0.08 (of 5), reps/s: 13, ave rate:
849
N: 256, time: best: 0.29, 2th-best: 0.30, ave: 0.31, 2th-
worst: 0.32, worst: 0.32 (of 5), reps/s: 3, ave rate:
832
N: 1024, time: best: 1.24, 2th-best: 1.24, ave: 1.30, 2th-
worst: 1.27, worst: 1.52 (of 5), reps/s: 0, ave rate:
787
N: 4096, time: best: 4.28, 2th-best: 4.29, ave: 4.40, 2th-
worst: 4.31, worst: 4.82 (of 5), reps/s: 0, ave rate:
931
benchmarking <function pack at 0x33003b0>
N: 64, time: best: 0.07, 2th-best: 0.07, ave: 0.08, 2th-
worst: 0.08, worst: 0.08 (of 5), reps/s: 13, ave rate:
843
N: 256, time: best: 0.29, 2th-best: 0.29, ave: 0.30, 2th-
worst: 0.31, worst: 0.32 (of 5), reps/s: 3, ave rate:
840
N: 1024, time: best: 1.14, 2th-best: 1.14, ave: 1.16, 2th-
worst: 1.17, worst: 1.20 (of 5), reps/s: 0, ave rate:
883
N: 4096, time: best: 4.30, 2th-best: 4.31, ave: 4.53, 2th-
worst: 4.64, worst: 4.99 (of 5), reps/s: 0, ave rate:
904
benchmarking <function unpack_and_repack at 0x3300430>
N: 64, time: best: 0.07, 2th-best: 0.07, ave: 0.07, 2th-
worst: 0.07, worst: 0.08 (of 5), reps/s: 13, ave rate:
860
N: 256, time: best: 0.29, 2th-best: 0.29, ave: 0.29, 2th-
worst: 0.29, worst: 0.29 (of 5), reps/s: 3, ave rate:
880
N: 1024, time: best: 1.14, 2th-best: 1.14, ave: 1.15, 2th-
worst: 1.14, worst: 1.21 (of 5), reps/s: 0, ave rate:
889
N: 4096, time: best: 4.28, 2th-best: 4.28, ave: 4.29, 2th-
worst: 4.29, worst: 4.30 (of 5), reps/s: 0, ave rate:
955
}}}
and profiling shows:
{{{
benchmarking <function unpack at 0x3278430>
N: 64, time: best: 0.08, 2th-best: 0.08, ave: 0.09, 2th-
worst: 0.08, worst: 0.10 (of 5), reps/s: 11, ave rate:
738
N: 256, time: best: 0.34, 2th-best: 0.34, ave: 0.34, 2th-
worst: 0.34, worst: 0.38 (of 5), reps/s: 2, ave rate:
744
N: 1024, time: best: 1.29, 2th-best: 1.29, ave: 1.48, 2th-
worst: 1.46, worst: 2.04 (of 5), reps/s: 0, ave rate:
693
N: 4096, time: best: 4.84, 2th-best: 4.85, ave: 4.97, 2th-
worst: 4.87, worst: 5.45 (of 5), reps/s: 0, ave rate:
824
benchmarking <function pack at 0x32783f0>
N: 64, time: best: 0.08, 2th-best: 0.08, ave: 0.10, 2th-
worst: 0.09, worst: 0.18 (of 5), reps/s: 9, ave rate:
611
N: 256, time: best: 0.33, 2th-best: 0.34, ave: 0.34, 2th-
worst: 0.34, worst: 0.34 (of 5), reps/s: 2, ave rate:
764
N: 1024, time: best: 1.29, 2th-best: 1.29, ave: 1.30, 2th-
worst: 1.30, worst: 1.30 (of 5), reps/s: 0, ave rate:
790
N: 4096, time: best: 4.85, 2th-best: 4.85, ave: 4.88, 2th-
worst: 4.87, worst: 4.96 (of 5), reps/s: 0, ave rate:
840
benchmarking <function unpack_and_repack at 0x3278470>
N: 64, time: best: 0.08, 2th-best: 0.08, ave: 0.08, 2th-
worst: 0.08, worst: 0.09 (of 5), reps/s: 11, ave rate:
754
N: 256, time: best: 0.33, 2th-best: 0.34, ave: 0.34, 2th-
worst: 0.34, worst: 0.34 (of 5), reps/s: 2, ave rate:
764
N: 1024, time: best: 1.30, 2th-best: 1.30, ave: 1.31, 2th-
worst: 1.31, worst: 1.37 (of 5), reps/s: 0, ave rate:
779
N: 4096, time: best: 4.85, 2th-best: 4.87, ave: 4.89, 2th-
worst: 4.91, worst: 4.96 (of 5), reps/s: 0, ave rate:
837
674783 function calls in 4.909 CPU seconds
Ordered by: internal time
List reduced from 75 to 32 due to restriction <32>
ncalls tottime percall cumtime percall filename:lineno(function)
36830 2.422 0.000 2.591 0.000 base32.py:60(b2a_l)
7366 0.527 0.000 1.606 0.000 base32.py:208(a2b_l)
54026 0.154 0.000 0.154 0.000 netstring.py:3(netstring)
20879 0.149 0.000 0.149 0.000 hashutil.py:26(digest)
7366 0.111 0.000 0.111 0.000 hashutil.py:163(_xor)
36830 0.093 0.000 0.093 0.000 string.py:306(join)
3683 0.088 0.000 0.088 0.000 encoder.py:205(iterencode)
58928 0.080 0.000 0.080 0.000 string.py:480(translate)
1 0.076 0.076 1.905 1.905
dirnode.py:248(_pack_contents)
7366 0.066 0.000 0.066 0.000
netstring.py:7(split_netstring)
29464 0.060 0.000 2.141 0.000 base32.py:48(b2a)
49124 0.058 0.000 0.058 0.000 hashutil.py:23(update)
3683 0.056 0.000 0.386 0.000
dirnode.py:196(_encrypt_rwcap)
7366 0.055 0.000 0.199 0.000
hashutil.py:48(tagged_pair_hash)
13513 0.054 0.000 0.134 0.000
hashutil.py:38(tagged_hasher)
3683 0.050 0.000 0.050 0.000 decoder.py:341(raw_decode)
1 0.046 0.046 2.999 2.999
dirnode.py:218(_unpack_contents)
3683 0.042 0.000 0.155 0.000
dirnode.py:207(_decrypt_rwcapdata)
7366 0.040 0.000 1.697 0.000 base32.py:199(a2b)
125222 0.040 0.000 0.040 0.000
assertutil.py:30(precondition)
3683 0.040 0.000 0.151 0.000 hashutil.py:166(hmac)
13513 0.037 0.000 0.286 0.000 hashutil.py:43(tagged_hash)
20879 0.035 0.000 0.035 0.000 hashutil.py:19(__init__)
}}}
Whoops, time for me to go to work! I hope to measure and commit the rest
of my optimization patches today after work. There is one that I really
want other people's input on before I commit it... That one generates the
per-entry IV with a secure hash of the writecap instead of with
{{{os.urandom(16)}}}.
--
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/329#comment:22>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid
More information about the tahoe-dev
mailing list