| 18 | This works fine as long as the total number of bytes accumulated and the number of separate {{{add_data()}}} events stay small, but it has O(N^2^) behavior and performs terribly if those numbers get large. Here are some benchmarks generated by running {{{python -OOu -c 'from stringchain.bench import bench; bench.quick_bench()'}}} as instructed by [source:README.txt the README.txt file]: |
| 19 | |
| 20 | {{{ |
| 21 | _buildup init_naive |
| 22 | N: 65536, time: best: 0.00, 2th-best: 0.00, ave: 0.00, 2th-worst: 0.00, worst: 0.00 (of 5), reps/s: 890, ave rate: 58350579 |
| 23 | N: 131072, time: best: 0.00, 2th-best: 0.00, ave: 0.00, 2th-worst: 0.00, worst: 0.00 (of 5), reps/s: 265, ave rate: 34800398 |
| 24 | N: 262144, time: best: 0.01, 2th-best: 0.01, ave: 0.01, 2th-worst: 0.01, worst: 0.01 (of 5), reps/s: 79, ave rate: 20745346 |
| 25 | N: 524288, time: best: 0.05, 2th-best: 0.05, ave: 0.05, 2th-worst: 0.05, worst: 0.05 (of 5), reps/s: 20, ave rate: 10823850 |
| 26 | _buildup init_strch |
| 27 | N: 65536, time: best: 0.00, 2th-best: 0.00, ave: 0.00, 2th-worst: 0.00, worst: 0.00 (of 5), reps/s: 25543, ave rate: 1674043282 |
| 28 | N: 131072, time: best: 0.00, 2th-best: 0.00, ave: 0.00, 2th-worst: 0.00, worst: 0.00 (of 5), reps/s: 14179, ave rate: 1858538925 |
| 29 | N: 262144, time: best: 0.00, 2th-best: 0.00, ave: 0.00, 2th-worst: 0.00, worst: 0.00 (of 5), reps/s: 8016, ave rate: 2101513050 |
| 30 | N: 524288, time: best: 0.00, 2th-best: 0.00, ave: 0.00, 2th-worst: 0.00, worst: 0.00 (of 5), reps/s: 4108, ave rate: 2154215572 |
| 31 | _consume init_naive_loaded |
| 32 | N: 65536, time: best: 0.00, 2th-best: 0.00, ave: 0.00, 2th-worst: 0.00, worst: 0.00 (of 5), reps/s: 931, ave rate: 61037862 |
| 33 | N: 131072, time: best: 0.00, 2th-best: 0.00, ave: 0.00, 2th-worst: 0.00, worst: 0.00 (of 5), reps/s: 270, ave rate: 35454393 |
| 34 | N: 262144, time: best: 0.01, 2th-best: 0.01, ave: 0.01, 2th-worst: 0.01, worst: 0.01 (of 5), reps/s: 74, ave rate: 19471963 |
| 35 | N: 524288, time: best: 0.05, 2th-best: 0.05, ave: 0.05, 2th-worst: 0.05, worst: 0.06 (of 5), reps/s: 19, ave rate: 10146747 |
| 36 | _consume init_strch_loaded |
| 37 | N: 65536, time: best: 0.00, 2th-best: 0.00, ave: 0.00, 2th-worst: 0.00, worst: 0.00 (of 5), reps/s: 4309, ave rate: 282447500 |
| 38 | N: 131072, time: best: 0.00, 2th-best: 0.00, ave: 0.00, 2th-worst: 0.00, worst: 0.00 (of 5), reps/s: 2313, ave rate: 303263357 |
| 39 | N: 262144, time: best: 0.00, 2th-best: 0.00, ave: 0.00, 2th-worst: 0.00, worst: 0.00 (of 5), reps/s: 1186, ave rate: 311159052 |
| 40 | N: 524288, time: best: 0.00, 2th-best: 0.00, ave: 0.00, 2th-worst: 0.00, worst: 0.00 (of 5), reps/s: 606, ave rate: 317814669 |
| 41 | _randomy init_naive |
| 42 | N: 65536, time: best: 0.00, 2th-best: 0.00, ave: 0.00, 2th-worst: 0.00, worst: 0.00 (of 5), reps/s: 479, ave rate: 31450561 |
| 43 | N: 131072, time: best: 0.01, 2th-best: 0.01, ave: 0.01, 2th-worst: 0.01, worst: 0.01 (of 5), reps/s: 140, ave rate: 18461191 |
| 44 | N: 262144, time: best: 0.02, 2th-best: 0.02, ave: 0.02, 2th-worst: 0.03, worst: 0.03 (of 5), reps/s: 42, ave rate: 11127714 |
| 45 | N: 524288, time: best: 0.06, 2th-best: 0.07, ave: 0.08, 2th-worst: 0.08, worst: 0.09 (of 5), reps/s: 13, ave rate: 6906341 |
| 46 | _randomy init_strch |
| 47 | N: 65536, time: best: 0.00, 2th-best: 0.00, ave: 0.00, 2th-worst: 0.00, worst: 0.00 (of 5), reps/s: 973, ave rate: 63827127 |
| 48 | N: 131072, time: best: 0.00, 2th-best: 0.00, ave: 0.00, 2th-worst: 0.00, worst: 0.00 (of 5), reps/s: 495, ave rate: 64970669 |
| 49 | N: 262144, time: best: 0.00, 2th-best: 0.00, ave: 0.00, 2th-worst: 0.00, worst: 0.00 (of 5), reps/s: 239, ave rate: 62913360 |
| 50 | N: 524288, time: best: 0.01, 2th-best: 0.01, ave: 0.01, 2th-worst: 0.01, worst: 0.01 (of 5), reps/s: 121, ave rate: 63811569 |
| 51 | }}} |
| 52 | |
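| 53 | The naive approach is slower than the !StringChain library in every benchmark, and it gets slower as the dataset grows: its "ave rate" drops by about a factor of five as N goes from 65536 to 524288 (for example from 58350579 to 10823850 in {{{_buildup}}}). The !StringChain library's rate stays roughly constant over the same range, so it is scalable (with regard to these benchmarks at least...). |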
| 54 | |
| 55 | Okay, how do you use it? It is very simple -- see [source:stringchain/stringchain.py] and let me know if that interface doesn't fit your use case. |
| 56 | |
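To make the O(N^2^) point above concrete, here is a minimal sketch of the two strategies. It is an illustration only and is not the !StringChain code or its interface: the class names {{{NaiveAccumulator}}} and {{{ChunkAccumulator}}} and the {{{pop()}}} method are hypothetical names made up for this example, and the chunked version assumes that a {{{collections.deque}}} of byte strings is a fair stand-in for the general idea.

```python
from collections import deque


class NaiveAccumulator:
    """Hypothetical example: keep everything in one bytes object."""

    def __init__(self):
        self.buf = b""

    def add_data(self, data):
        # Concatenation builds a new bytes object, so in general each call
        # copies everything accumulated so far.
        self.buf += data

    def pop(self, n):
        # Slicing off the front copies the whole remainder.
        head, self.buf = self.buf[:n], self.buf[n:]
        return head


class ChunkAccumulator:
    """Hypothetical example: keep the pieces in a deque, join only on demand."""

    def __init__(self):
        self.chunks = deque()
        self.length = 0

    def add_data(self, data):
        # O(1): remember the piece, copy nothing.
        if data:
            self.chunks.append(data)
            self.length += len(data)

    def pop(self, n):
        # Only the chunks that are actually consumed get touched.
        n = min(n, self.length)
        out = []
        need = n
        while need > 0:
            chunk = self.chunks.popleft()
            if len(chunk) > need:
                # Put the unused tail of this chunk back at the front.
                self.chunks.appendleft(chunk[need:])
                chunk = chunk[:need]
            out.append(chunk)
            need -= len(chunk)
        self.length -= n
        return b"".join(out)
```

Both classes return the same bytes for the same sequence of calls. The difference is only in how much copying happens: the naive version does work proportional to everything still buffered on each call, while the chunked version does work proportional to the bytes being added or removed.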