Performance – Tahoe-LAFS

Version 35 (modified by zooko, at 2011-07-15T19:20:50Z)
add a few more performance notes

(See also copious notes and data about performance of older versions of Tahoe-LAFS, archived at Performance/Old.)

In late 2010 Kyle Markley benchmarked what were then the release candidates for Tahoe-LAFS v1.8.0. This helped us catch two major performance regressions in Brian's New Downloader and helped make Tahoe-LAFS v1.8.0 into an excellent new release (see epic ticket #1170 for mind-numbing details). Kyle also contributed the code for his benchmarking scripts (in Perl), but to my knowledge nobody has yet tried to re-use them.

We also experimented with different segment sizes and immutable uploader pipeline depths. The results tentatively confirmed that the current segment size (128 KiB) and immutable uploader pipeline depth (50,000 B) performed better on both of Kyle's networks than any of the alternatives that he tried.
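To make the segment-size figure concrete, here is a minimal sketch of the arithmetic for how many segments a file is split into. The function name and the constant are illustrative assumptions, not names from the Tahoe-LAFS codebase; only the 128 KiB value comes from the text above.

```python
# Illustrative arithmetic only; these names are hypothetical and do not
# come from the Tahoe-LAFS source tree.
SEGMENT_SIZE = 128 * 1024  # 128 KiB, the segment size quoted above


def segment_count(file_size, segment_size=SEGMENT_SIZE):
    """Return how many segments a file of file_size bytes occupies.

    The last segment may be shorter than segment_size, so this is a
    ceiling division.
    """
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    return (file_size + segment_size - 1) // segment_size


if __name__ == "__main__":
    for size in (1, 128 * 1024, 1_000_000, 10 * 1024 * 1024):
        print(size, "bytes ->", segment_count(size), "segment(s)")
```

A smaller segment size means more segments per file and therefore more round trips, while a larger one means more data is buffered per step, which is the trade-off such experiments probe.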

Along the way Terrell Russell did some benchmarking and contributed a bash script which I used several times during the process:

http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1170#comment:81

At about the same time Nathan Eisenberg of Atlas Networks did a couple of manual benchmarks:

François Deppierraz has also run a few benchmarks. (I can't find a link to his results.)

Jeff Darcy benchmarked Tahoe-LAFS vs. his new CloudFS (based on Gluster) vs. encfs vs. ecryptfs vs. Gluster, using iozone:

https://fedorahosted.org/pipermail/cloudfs-devel/2011-June/000097.html

https://fedorahosted.org/pipermail/cloudfs-devel/2011-June/000099.html

Ticket #932 (benchmark Tahoe-LAFS compared to nosql dbs) proposes running the standard "YCSB" benchmarks for NoSQL databases against Tahoe-LAFS.

What we really want, of course, is automated benchmarks that are executed at regularly scheduled intervals, or whenever a new patch is committed to revision control, or both. Ideally these would run on dedicated hardware, or at least on virtualized hardware with a fairly consistent load from other tenants, so that the resulting measurements would not pick up too much noise from other people's behavior. You can see on Performance/Old that we used to have such an automated setup, including graphs of the resulting performance.
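As a rough illustration of the kind of measurement such an automated job could take, here is a minimal timing-harness sketch. It is not the setup described on Performance/Old; the function name, the repeat count, and the placeholder workload are all assumptions made for the example.

```python
import statistics
import time


def benchmark(operation, repeats=5):
    """Call operation() `repeats` times and summarize wall-clock seconds.

    Hypothetical helper for illustration. Reporting the median alongside
    the min and max gives a crude view of how noisy the host was.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        samples.append(time.perf_counter() - start)
    return {
        "runs": repeats,
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


if __name__ == "__main__":
    # Placeholder workload; a real job would time an upload or download.
    result = benchmark(lambda: sum(range(100_000)), repeats=5)
    print(result)
```

In a real job the placeholder workload would be replaced by an actual upload or download against a test grid, and the resulting numbers would be recorded per revision so that regressions show up as a change over time.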

Attachments (1)

pipeline-sends.diff (1.5 KB) - added by warner at 2007-09-09T00:10:23Z. patch to pipeline the hash-sends during upload
