Changes between Initial Version and Version 1 of Performance/Sep2011


Timestamp: 2011-09-23T06:17:45Z
Author: warner
Comment: add report on performance tests

== The atlasperf1 grid ==

All these performance tests are run on a four-machine grid, using hardware
generously provided by Atlas Networks. Each machine is a dual-core 3GHz P4,
connected with gigabit(?) ethernet links. Three machines are servers, each
running two storage servers (six storage servers in all), each on a separate
disk. The fourth machine is a client. The storage servers run a variety of
versions.

== Versions ==

These tests were conducted from 19-Sep-2011 to 22-Sep-2011, against Tahoe
versions 1.7.1, 1.8.2, and trunk (circa 19-Sep-2011, about
[changeset:8e69b94588c1c0e7]).

== Overall Speed ==

With the default encoding (k=3), trunk MDMF downloads on this grid run at
4.0MBps. Trunk CHK downloads run at 2.6MBps. (For historical comparison, the
old CHK downloader from 1.7.1 runs at 4.4MBps.) CHK performance drops
significantly with larger k.

== MDMF (trunk) ==

MDMF is fast! Trunk downloads 1MB/10MB/100MB MDMF files at around 4MBps. The
download speed seems unaffected by k (from 1 to 60). Partial reads take the
expected amount of time: O(data_read), slightly quantized near the 128KiB
segment size.

* MDMF read versus k, 100MB: attachment:MDMF-100MB-vs-k.png (timing5.out)
* MDMF partial reads, 100MB: attachment:MDMF-100MB-partial.png (timing6.out)
* MDMF partial reads, 1MB: attachment:MDMF-1MB-partial.png (timing6.out)
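The quantization is easy to model: a partial read has to fetch every 128KiB
segment it touches, so the transfer cost rounds up to segment boundaries. A
minimal sketch of that model (the function name is illustrative, not a
Tahoe-LAFS API):

```python
SEGMENT_SIZE = 128 * 1024  # default segment size, 128 KiB

def segments_touched(offset, length, segsize=SEGMENT_SIZE):
    """How many whole segments a partial read of `length` bytes at
    `offset` must fetch: reads are quantized to segment boundaries."""
    if length <= 0:
        return 0
    first = offset // segsize
    last = (offset + length - 1) // segsize
    return last - first + 1

# A 1-byte read costs one full segment; a 2-byte read straddling a
# boundary costs two, which produces the step pattern in the graphs.
print(segments_touched(0, 1))                 # 1
print(segments_touched(SEGMENT_SIZE - 1, 2))  # 2
print(segments_touched(0, 1000000))           # 8 (a ~1MB read)
```

This is why the partial-read curves step near multiples of the segment size
rather than growing smoothly with bytes requested.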

== CHK (trunk) ==

The new-downloader (introduced in 1.8.0) does not saturate these fast
connections. Compared to the old-downloader (in 1.7.1), downloads tend to be
about 3x slower. (Note that this delay is probably completely hidden on slow
WAN links; it is only the fast LAN connections of the atlasperf1 grid that
expose the delay.) In addition, both old and new downloaders suffer from a
linear slowdown as k increases. On the new-downloader, k=60 takes roughly 3x
more time than k=15. Trunk contains a fix for #1268 that might improve speeds
by 5% compared to 1.8.2. Partial reads take the expected amount of time,
although the segsize quantization was nowhere near as clear as with MDMF.

* CHK (1.7.1/1.8.2/trunk) read versus k, 1MB: attachment:CHK-1MB-vs-k.png (t4/t/t3)
* CHK (1.7.1/1.8.2/trunk) read versus k, 100MB: attachment:CHK-100MB-vs-k.png (t4/t/t3)

* CHK (1.8.2) read versus segsize, 1MB: attachment:CHK-1MB-vs-segsize.png (t2)
* CHK (1.8.2) read versus segsize, 100MB: attachment:CHK-100MB-vs-segsize.png (t2)

* CHK (trunk) partial reads, 1MB: attachment:CHK-1MB-partial.png (t7)
* CHK (trunk) partial reads, k=3, 1MB: attachment:CHK-1MB-k3-partial.png (t7)
* CHK (trunk) partial reads, 100MB: attachment:CHK-100MB-partial.png (t7)

Likely problems include:

* high k and the default segsize=128KiB mean tiny segments, like 2KB when k=60
* lots of reads, lots of foolscap messages, and marshalling is probably slow
* disk seeks to gather hash nodes from all over the share
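The first point is simple arithmetic: each 128KiB segment is split k ways
before erasure coding, so each share holds roughly segsize/k bytes per
segment, and large k means each remote read fetches only a sliver. A quick
sketch of the numbers (illustrative, not Tahoe-LAFS code):

```python
SEGMENT_SIZE = 128 * 1024  # default 128 KiB segments

def share_block_size(k, segsize=SEGMENT_SIZE):
    """Bytes of each segment that land in a single share: the segment
    is split into k pieces before erasure coding, rounded up."""
    return -(-segsize // k)  # ceiling division

print(share_block_size(3))   # 43691 bytes (~43 KiB) at the default k=3
print(share_block_size(60))  # 2185 bytes: the ~2KB blocks noted above
```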

Likely fixes include:

* add a readv() API, to reduce the number of Foolscap messages in flight
* prefetch hash-tree nodes, by reading larger chunks of the tree at once.
  The old-downloader cheated by reading the whole hash tree at once; this
  violates the memory-footprint goals (it requires O(numsegments) memory)
  but is probably tolerable unless the filesize is really large.
* encourage use of larger segsize for large files (at the expense of
  alacrity)
* use unencrypted HTTP for share reads

readv() is the least-work/most-promising fix, since MDMF has readv() and
achieves high speeds. The first step is to do whatever MDMF is doing.
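To illustrate why readv() helps, here is a toy model (the class and method
names are hypothetical stand-ins, not the actual Foolscap or Tahoe-LAFS
storage API): fetching N hash-tree nodes with individual read() calls costs
N round-trip messages, while one vectorized readv() call costs a single
message for the whole batch.

```python
class FakeStorageServer:
    """Toy stand-in for a remote storage server that counts messages."""
    def __init__(self, share_data):
        self.share = share_data
        self.messages = 0  # remote round trips

    def read(self, offset, length):
        self.messages += 1  # one message per span
        return self.share[offset:offset + length]

    def readv(self, vector):
        self.messages += 1  # one message covers the whole batch
        return [self.share[o:o + l] for (o, l) in vector]

server = FakeStorageServer(bytes(range(256)) * 16)
spans = [(i * 32, 32) for i in range(40)]  # e.g. 40 hash-tree nodes

a = [server.read(o, l) for (o, l) in spans]
print(server.messages)  # 40 messages

server.messages = 0
b = server.readv(spans)
print(server.messages)  # 1 message
assert a == b           # same data either way
```

The data returned is identical; only the per-message latency and
marshalling overhead differs, which is exactly what dominates on a fast
LAN with tiny blocks.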

== Future Tests ==

* measure alacrity: ask for a random single byte, measure elapsed time
* measure partial-read speeds for CHK
* measure SDMF/MDMF modification times
* measure upload times
* using existing data as a baseline, detect outliers in real time during the
  benchmark run, and capture more information about them (their "Recent
  Uploads And Downloads" event timeline, for starters)
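The alacrity test could be driven by a small harness like this (a sketch:
`fetch_byte` is a placeholder for whatever issues the single-byte request,
e.g. an HTTP GET with a Range header against the client's gateway):

```python
import random
import time

def measure_alacrity(fetch_byte, filesize, trials=10):
    """Fetch one random byte per trial and time it; returns the
    (minimum, mean) elapsed seconds across trials."""
    samples = []
    for _ in range(trials):
        offset = random.randrange(filesize)
        t0 = time.monotonic()
        fetch_byte(offset)  # assumed to block until the byte arrives
        samples.append(time.monotonic() - t0)
    return min(samples), sum(samples) / len(samples)

# Dummy fetcher, just to show the call shape:
best, mean = measure_alacrity(lambda off: b"x", filesize=100 * 2**20)
print(best <= mean)  # True
```

Random offsets matter here: alacrity for the first segment (which needs no
hash-tree traversal to start) can differ from a byte deep in the file.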

== Additional Notes ==

Some graphs were added to
http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1264#comment:17 .

The complete benchmark toolchain and data are included in
attachment:benchmark.git.tar.gz