[tahoe-dev] Scientific paper: Experimenting with SHA-3 candidates in Tahoe-LAFS

Pål Ruud paal.ruud at ntnu.no
Tue Feb 22 01:41:45 PST 2011

Hello, all!

Zooko asked me to post the results of a project that my colleague and
I did this autumn: trying out the (then) 14 candidates of the NIST
SHA-3 competition with Tahoe-LAFS.

Unfortunately, the project did not end up with code submitted to the
Tahoe repository. We wanted to create a way to easily change the
underlying hash algorithm, maybe in the form of new capabilities, but we
were limited by time, as we had to write Python bindings for a large
number of different implementations of the SHA-3 candidates.

The resulting paper [0] is approx. 60 pages long, so I will try to sum
up what I think is relevant:
- From «Abstract»:
  In general, four of the candidates perform better than the currently
  used SHA-256 - BLAKE, BlueMidnightWish, Shabal and SIMD. By utilizing the
  performance improvements given by SHA-3, we show that Tahoe-LAFS could
  currently save up to 12.36% on upload operations by changing to the best
  performing candidate.

  This saving should increase once better optimizations are available
  and the focus is on one final algorithm, i.e. the «winner».

- From 3.3 «Modifications to the Tahoe-LAFS Code»:
  ... only immutable files are tested and measured in this paper.

- From 3.5 «Configuration of Test Environment»:
  Four machines were used as a testing grid, each containing an Intel(R)
  Core(TM)2 Duo E8300 CPU and 4GB of RAM. The 32-bit version of Ubuntu
  Linux Server Edition 10.04 was the operating system of choice.

  Tahoe-LAFS was set up to 2/3/3 (shares needed/happy/available).

- Section 3.4 «Measuring Performance»:
  Describes how and where we timed the different calls, i.e. within
  util.hashutil and with Bash scripts.

- Section 4.4 «Measurements of internal Tahoe hashing sizes»:
  Displays the distribution of data chunk sizes while uploading the test
  vectors. The test vectors are listed in Section 3.5.1.

- Tables 4.4 to 4.13: (Probably most useful.)
  List total time, time spent hashing, and the fraction (time spent
  hashing / total time) for each of the test vectors, for both upload
  and download and for each of the different SHA-3 candidates. These
  numbers are also illustrated in the graphs that follow the tables.
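
As a rough illustration of the kind of measurement described in
Section 3.4 and reported in the tables, here is a minimal Python sketch.
It is not the instrumentation used in the paper: HashTimer and
hashing_fraction are hypothetical names, hashlib.sha256 stands in for
whichever candidate is under test, and the loop over chunks is only a
stand-in for a real upload.

```python
import hashlib
import time


class HashTimer:
    """Hypothetical helper: accumulates wall-clock time spent in hash calls."""

    def __init__(self, factory=hashlib.sha256):
        self.factory = factory  # any hashlib-style constructor
        self.seconds = 0.0      # total time spent hashing so far
        self.calls = 0          # number of hash calls made

    def hash(self, data):
        """Hash one chunk and add the elapsed time to the running total."""
        start = time.perf_counter()
        digest = self.factory(data).digest()
        self.seconds += time.perf_counter() - start
        self.calls += 1
        return digest


def hashing_fraction(hash_seconds, total_seconds):
    """Fraction of the total time that was spent hashing."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return hash_seconds / total_seconds


if __name__ == "__main__":
    timer = HashTimer()
    # Assumed workload: 256 chunks of 4 KiB, not the paper's test vectors.
    chunks = [b"\x00" * 4096 for _ in range(256)]
    begin = time.perf_counter()
    for chunk in chunks:
        timer.hash(chunk)
    total = time.perf_counter() - begin
    print("hash calls:", timer.calls)
    print("fraction spent hashing:", hashing_fraction(timer.seconds, total))
```

Passing a different constructor as factory (for example a binding for
one of the candidates) would let the same loop be timed per algorithm.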

We hope that someone finds this small contribution useful somehow.

[0] http://folk.ntnu.no/palru/project.pdf

Best regards,
Pål Ruud and Eirik Haver,
Department of Telematics,
Norwegian University of Science and Technology
