
Changes between Version 2 and Version 3 of WikiStart


Timestamp:
2010-03-12 06:57:47 (15 years ago)
Author:
zooko
Comment:

whee

  • WikiStart

    v2 v3  
    1616}}}
    1717
     18This works fine as long as the total number of bytes accumulated and the number of separate {{{add_data()}}} events stay small, but it has O(N^2^) behavior and performs terribly when those numbers get large. Here are some benchmarks generated by running {{{python -OOu -c 'from stringchain.bench import bench; bench.quick_bench()'}}} as instructed by [source:README.txt the README.txt file]:
     19
     20{{{
     21_buildup init_naive
     22N:   65536, time: best:    0.00,   2th-best:    0.00, ave:    0.00,   2th-worst:    0.00, worst:    0.00 (of      5), reps/s:    890, ave rate: 58350579
     23N:  131072, time: best:    0.00,   2th-best:    0.00, ave:    0.00,   2th-worst:    0.00, worst:    0.00 (of      5), reps/s:    265, ave rate: 34800398
     24N:  262144, time: best:    0.01,   2th-best:    0.01, ave:    0.01,   2th-worst:    0.01, worst:    0.01 (of      5), reps/s:     79, ave rate: 20745346
     25N:  524288, time: best:    0.05,   2th-best:    0.05, ave:    0.05,   2th-worst:    0.05, worst:    0.05 (of      5), reps/s:     20, ave rate: 10823850
     26_buildup init_strch
     27N:   65536, time: best:    0.00,   2th-best:    0.00, ave:    0.00,   2th-worst:    0.00, worst:    0.00 (of      5), reps/s:  25543, ave rate: 1674043282
     28N:  131072, time: best:    0.00,   2th-best:    0.00, ave:    0.00,   2th-worst:    0.00, worst:    0.00 (of      5), reps/s:  14179, ave rate: 1858538925
     29N:  262144, time: best:    0.00,   2th-best:    0.00, ave:    0.00,   2th-worst:    0.00, worst:    0.00 (of      5), reps/s:   8016, ave rate: 2101513050
     30N:  524288, time: best:    0.00,   2th-best:    0.00, ave:    0.00,   2th-worst:    0.00, worst:    0.00 (of      5), reps/s:   4108, ave rate: 2154215572
     31_consume init_naive_loaded
     32N:   65536, time: best:    0.00,   2th-best:    0.00, ave:    0.00,   2th-worst:    0.00, worst:    0.00 (of      5), reps/s:    931, ave rate: 61037862
     33N:  131072, time: best:    0.00,   2th-best:    0.00, ave:    0.00,   2th-worst:    0.00, worst:    0.00 (of      5), reps/s:    270, ave rate: 35454393
     34N:  262144, time: best:    0.01,   2th-best:    0.01, ave:    0.01,   2th-worst:    0.01, worst:    0.01 (of      5), reps/s:     74, ave rate: 19471963
     35N:  524288, time: best:    0.05,   2th-best:    0.05, ave:    0.05,   2th-worst:    0.05, worst:    0.06 (of      5), reps/s:     19, ave rate: 10146747
     36_consume init_strch_loaded
     37N:   65536, time: best:    0.00,   2th-best:    0.00, ave:    0.00,   2th-worst:    0.00, worst:    0.00 (of      5), reps/s:   4309, ave rate: 282447500
     38N:  131072, time: best:    0.00,   2th-best:    0.00, ave:    0.00,   2th-worst:    0.00, worst:    0.00 (of      5), reps/s:   2313, ave rate: 303263357
     39N:  262144, time: best:    0.00,   2th-best:    0.00, ave:    0.00,   2th-worst:    0.00, worst:    0.00 (of      5), reps/s:   1186, ave rate: 311159052
     40N:  524288, time: best:    0.00,   2th-best:    0.00, ave:    0.00,   2th-worst:    0.00, worst:    0.00 (of      5), reps/s:    606, ave rate: 317814669
     41_randomy init_naive
     42N:   65536, time: best:    0.00,   2th-best:    0.00, ave:    0.00,   2th-worst:    0.00, worst:    0.00 (of      5), reps/s:    479, ave rate: 31450561
     43N:  131072, time: best:    0.01,   2th-best:    0.01, ave:    0.01,   2th-worst:    0.01, worst:    0.01 (of      5), reps/s:    140, ave rate: 18461191
     44N:  262144, time: best:    0.02,   2th-best:    0.02, ave:    0.02,   2th-worst:    0.03, worst:    0.03 (of      5), reps/s:     42, ave rate: 11127714
     45N:  524288, time: best:    0.06,   2th-best:    0.07, ave:    0.08,   2th-worst:    0.08, worst:    0.09 (of      5), reps/s:     13, ave rate:  6906341
     46_randomy init_strch
     47N:   65536, time: best:    0.00,   2th-best:    0.00, ave:    0.00,   2th-worst:    0.00, worst:    0.00 (of      5), reps/s:    973, ave rate: 63827127
     48N:  131072, time: best:    0.00,   2th-best:    0.00, ave:    0.00,   2th-worst:    0.00, worst:    0.00 (of      5), reps/s:    495, ave rate: 64970669
     49N:  262144, time: best:    0.00,   2th-best:    0.00, ave:    0.00,   2th-worst:    0.00, worst:    0.00 (of      5), reps/s:    239, ave rate: 62913360
     50N:  524288, time: best:    0.01,   2th-best:    0.01, ave:    0.01,   2th-worst:    0.01, worst:    0.01 (of      5), reps/s:    121, ave rate: 63811569
     51}}}
     52
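     53The naive approach is slower than the !StringChain library in every benchmark above, and it gets slower as the dataset grows: its average rate drops by roughly 40 to 50% each time N doubles, while the !StringChain rate stays roughly constant or improves. The !StringChain library is scalable (with regard to these benchmarks at least...).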
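
To make the difference concrete, here is a minimal sketch of the two strategies. It is not the actual !StringChain code: the class names and the {{{pop_leading()}}} method are hypothetical, chosen only for illustration, and only {{{add_data()}}} comes from the text above. The naive buffer re-copies everything it holds on each operation, while the chained buffer keeps the chunks in a deque and joins only the bytes that are actually requested.

```python
from collections import deque


class NaiveBuffer:
    """Accumulate by concatenation; each operation may copy the whole buffer."""

    def __init__(self):
        self.data = b""

    def add_data(self, chunk):
        # Builds a new bytes object from the old contents plus the chunk.
        self.data = self.data + chunk

    def pop_leading(self, n):
        # Slicing copies both the head and the entire remainder.
        head, self.data = self.data[:n], self.data[n:]
        return head


class ChunkChain:
    """Hypothetical sketch: hold chunks in a deque and join only on demand."""

    def __init__(self):
        self.chunks = deque()
        self.length = 0

    def add_data(self, chunk):
        # Appending to a deque is O(1) and copies no payload bytes.
        self.chunks.append(chunk)
        self.length += len(chunk)

    def pop_leading(self, n):
        # Clamp the request to what is actually buffered.
        n = max(0, min(n, self.length))
        out, need = [], n
        while need > 0:
            chunk = self.chunks.popleft()
            if len(chunk) > need:
                # Put the unused tail back at the front of the chain.
                self.chunks.appendleft(chunk[need:])
                chunk = chunk[:need]
            out.append(chunk)
            need -= len(chunk)
        self.length -= n
        return b"".join(out)
```

Both classes return the same bytes for the same sequence of calls; the difference is only in how much copying happens along the way. In the naive version the work per call grows with the size of the buffer, which is where the O(N^2^) total comes from, whereas the chained version touches only the chunks it hands out.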
     54
     55Okay, how do you use it? It is very simple -- see [source:stringchain/stringchain.py] and let me know if that interface doesn't fit your use case.
     56
    1857== Starting Points ==
    1958