[zfec-dev] technical report measuring erasure code implementations

zooko zooko at zooko.com
Fri Aug 15 11:10:29 PDT 2008


Folks:

Catherine Schuman and James Plank have published a technical report
benchmarking several erasure code implementations including zfec.

http://www.cs.utk.edu/~plank/plank/papers/CS-08-625.html

zfec comes out looking really good!  Be sure to see the appendix,
where they show the performance with a few other erasure coding
parameters.

In their terminology, "k" stands for the number of data shares, which
is also called "K" in Tahoe and sometimes called the number of
"primary shares", and "m" stands for the number of check shares, or
what we call "secondary shares" in Tahoe.  We don't assign a variable
to denote the number of check shares in Tahoe; instead we let "M"
denote the total number of shares, both primary and secondary.

They tested 6-out-of-8, 14-out-of-16, 12-out-of-16, and 10-out-of-16.
Tahoe uses 3-out-of-10 currently.
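Whichever naming convention you use, the storage expansion factor is
just total shares divided by data shares.  A quick sketch in plain
Python (not the zfec API; a toy illustration of the parameter sets
mentioned above) comparing them:

```python
# Toy illustration of erasure-coding parameters (not the zfec API).
# k = number of data (primary) shares; n = total number of shares;
# n - k = number of check (secondary) shares.  Any k of the n shares
# suffice to reconstruct the original data.

def expansion(k, n):
    """Storage expansion factor: bytes stored per byte of data."""
    return n / k

params = [(6, 8), (14, 16), (12, 16), (10, 16), (3, 10)]
for k, n in params:
    print(f"{k}-out-of-{n}: {n - k} check shares, "
          f"expansion {expansion(k, n):.2f}x")
```

So the configurations they tested expand storage by 1.33x at most,
while Tahoe's 3-out-of-10 expands it by 3.33x but tolerates the loss
of any 7 of the 10 shares.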

I noticed one limitation in this technical report: someone needs to
benchmark zfec against Luigi Rizzo's old fec library, sometimes
called vdmfec.  Currently some people compare their newfangled
erasure codes to zfec (e.g. this technical report), while other
people compare theirs to Rizzo's library (e.g. [1]), so it is hard to
compare the two groups to each other.

Also, most of the credit for zfec being so fast probably goes to Luigi
Rizzo and the contributors to his library, including Phil Karn, Robert
Morelos-Zaragoza, Hari Thirumoorthy, and Dan Rubenstein.

It might be useful to know whether the changes I made to that code in
order to produce zfec actually helped performance at all.  I did do
some benchmarking myself at the time, but I didn't write the results
down, and my main motivation in creating zfec wasn't to improve the
performance of Rizzo's library (which was already good), but to
simplify the source code, add a Python API and a command-line tool,
and so forth.

By the way, I did a cache-friendliness experiment similar to the one
they report in this paper, and I came to a similar preferred size!
(Of course I didn't make nice graphs like they did, nor write it down
so that I could later reference it.)

"do encoding within a fixed window of memory in order to be cache friendly"
http://allmydata.org/trac/zfec/changeset/213

"reorder the inner loop to be more cache-friendly"
http://allmydata.org/trac/zfec/changeset/214

"set STRIDE to 8192 after extensive experimentation on my PowerPC G4 867 MHz (256 KB L2 cache)"
http://allmydata.org/trac/zfec/changeset/215

zfec encodes things in 8192-byte input strides for this reason.  I
always intended to change decode to do the same thing, but haven't
gotten around to it yet.
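
The strided approach can be sketched in plain Python (illustration
only: this toy uses XOR parity as a stand-in, whereas zfec's real
arithmetic is over GF(2^8); the point is the fixed-size window, so
that each window of the output stays resident in cache while every
input block contributes to it):

```python
# Sketch of cache-friendly strided encoding (illustration only; zfec's
# real code works over GF(2^8), this toy uses XOR as a stand-in).
STRIDE = 8192  # bytes processed per pass, sized to fit in L2 cache

def xor_parity_strided(blocks):
    """Compute one XOR parity block over equal-length input blocks,
    one STRIDE-sized window at a time."""
    length = len(blocks[0])
    parity = bytearray(length)
    for start in range(0, length, STRIDE):
        end = min(start + STRIDE, length)
        # Every input block touches this window before we move on,
        # so the window is only pulled into cache once.
        for block in blocks:
            for i, b in enumerate(block[start:end]):
                parity[start + i] ^= b
    return bytes(parity)
```

Without the striding, each full pass over a large input block evicts
the output from cache before the next block's pass begins.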

I also always intended to do a benchmark including cache and CPU
contention.  For example, a bash script like this:

# start 10 simultaneous benchmarks to compete for CPU and cache, then
# start one that we actually look at, then start a few more for more
# competition.

for L in 1 2 3 4 5 6 7 8 9 10 ; do
     python -OO ./bench/bench_zfec.py >/dev/null &
done
python -OO ./bench/bench_zfec.py &
for L in 1 2 3 4 5 ; do
     python -OO ./bench/bench_zfec.py >/dev/null &
done

Hm.  Now that I've written that script maybe I'll go ahead and run
it...  Stay tuned...

The tech report says that work on zfec began in 2004, but that's not
true -- the first checkins in revision control history are from
January 2007.

Regards,

Zooko

[1] http://planete-bcast.inrialpes.fr/breve.php3?id_breve=10
