[tahoe-dev] Tahoe benchmarking data

Sun Aug 1 05:27:58 UTC 2010

Dear Kyle (et alia):

I've thought some more about this and talked about it a bit with my
wife, Amber, and I have a few more comments.

* I realized that since your small files were themselves 64 KiB each,
any segment size >= 64 KiB would have the same effect as any other:
the whole file fits in a single segment either way. So the different
results you got for different runs with segment sizes >= 64 KiB for
small files just demonstrate the variability inherent in the system.

* Oh! And likewise, with shares.needed = 1, any pipeline size >= the
file size should behave the same. Reformatting your results for small
files to coalesce those two categories looks like this:

Wired LAN, small files, in seconds:
                     pipeline_size
              10KiB  50000 >=192KiB
segment_size
       8KiB  265.79
      16KiB                  234.98
      32KiB                  144.92
    >=64KiB          68.59   218.85/223.25/225.54/228.16/231.61

Wireless, small files, in seconds:
                     pipeline_size
              10KiB  50000 >=192KiB
segment_size
       8KiB  312.90
      16KiB                  341.78
      32KiB                  193.91
    >=64KiB         101.76    95.98/96.68/98.27/98.40/147.71
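
To spell out the arithmetic behind that coalescing, here is a minimal
Python sketch. The helper names (num_segments, summarize) are mine,
purely for illustration, and are not Tahoe-LAFS code. It assumes a
file is simply cut into ceil(file_size / segment_size) segments, and
then summarizes the spread of the five timings in each coalesced
>=64KiB / >=192KiB cell above.

```python
import statistics

KiB = 1024

def num_segments(file_size, segment_size):
    """Hypothetical helper: number of segments for a file of file_size
    bytes, assuming plain ceiling division (at least one segment)."""
    return max(1, (file_size + segment_size - 1) // segment_size)

def summarize(runs):
    """Return (mean, max - min) for a list of timings in seconds."""
    return statistics.mean(runs), max(runs) - min(runs)

# A 64 KiB file: every segment size >= 64 KiB gives exactly one segment.
file_size = 64 * KiB
for seg_kib in (8, 16, 32, 64, 128, 256):
    count = num_segments(file_size, seg_kib * KiB)
    print("segment_size %3d KiB -> %d segment(s)" % (seg_kib, count))

# Timings copied from the coalesced cells of the two tables above.
wired = [218.85, 223.25, 225.54, 228.16, 231.61]
wireless = [95.98, 96.68, 98.27, 98.40, 147.71]
for name, runs in (("wired", wired), ("wireless", wireless)):
    mean, spread = summarize(runs)
    print("%-8s mean %.2f s, max-min %.2f s" % (name, mean, spread))
```

The max-min figure is only a crude stand-in for "variability"; with
five runs per cell I wouldn't read much more into it than that.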

* I wonder if other, unrelated processes started running during your
benchmarks and used up your network bandwidth. Neighbors borrowing
your wifi? Buildbot jobs? Hm, you sent us your benchmark report at
2010-06-25 20:02:20Z. According to the buildbot, your OpenBSD machine
was serving up builds earlier that day and the previous day:

http://tahoe-lafs.org/buildbot/waterfall?show_events=true&branch=&builder=Kyle+OpenBSD-4.6+amd64&reload=none&last_time=1277496140

(Note: the timestamps on the left are in UTC-7 although they are not
marked as such. Bad buildbot.)

So if some of your test runs overlapped with those builds, then builds
of Tahoe-LAFS itself could have been interfering with your timing
results! :-)

Maybe next time you should turn off networking to the outside world
during your measurements.
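
If you want to check for such overlaps, the timezone bookkeeping can
be sketched in a few lines of Python. This is only a sketch under two
assumptions of mine: that last_time in the waterfall URL above is
seconds since the Unix epoch (it decodes to exactly the time of your
report, which supports that), and that the waterfall's unmarked
timestamps are at a fixed UTC-7 offset. The function names are
hypothetical, not part of buildbot or Tahoe-LAFS.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the waterfall page shows naive local times at UTC-7.
WATERFALL_TZ = timezone(timedelta(hours=-7))

def waterfall_to_utc(naive_local):
    """Attach the assumed UTC-7 offset to a naive waterfall timestamp
    and convert it to UTC."""
    return naive_local.replace(tzinfo=WATERFALL_TZ).astimezone(timezone.utc)

def overlaps(start_a, end_a, start_b, end_b):
    """True if the half-open intervals [start_a, end_a) and
    [start_b, end_b) intersect."""
    return start_a < end_b and start_b < end_a

# last_time from the waterfall URL, assumed to be a Unix timestamp.
last_time = datetime.fromtimestamp(1277496140, tz=timezone.utc)
print("UTC:      ", last_time.isoformat())
print("waterfall:", last_time.astimezone(WATERFALL_TZ).isoformat())
```

Feed each build's start and end through waterfall_to_utc(), and then
overlaps() tells you whether that build ran during a given benchmark
run, provided the benchmark times are recorded in UTC as well.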

* Do you know about the "Recent Uploads and Downloads" page and the
timings it reports for every upload and download? Those would be very
good to capture and include with benchmark results like these.

Regards,

Zooko