[tahoe-dev] Tahoe benchmarking data

Brian Warner warner at lothar.com
Mon Jul 26 23:09:54 UTC 2010


On 7/26/10 2:29 PM, Chris Palmer wrote:
> Brian Warner writes:
> 
>> Yup. I suspect that your large files are running into python's performance
>> limits: the best way to speed those up will be to move our transport to
>> something with less overhead (signed HTTP is our current idea, ticket
>> #510), then to start looking at what pieces can be rewritten in a faster
>> language. The obvious parts are already in C or C++.
> 
> I don't understand this. What limit of Python's is responsible?

I'm referring to the number of CPU cycles it takes for Python to
encrypt, encode, and push a byte over the wire, when you take into
account all the extra protocol layers involved. Many of the bulk
operations already run in C/C++ (zfec, pycryptopp, openssl), but some
are still pure-Python code (Foolscap copying bytes into a transmit
buffer with regular string copies, parsing received messages on the
far end, copying bytes from the incoming message into the share file
on the storage server, etc.). And there are lots of places with
constant-time overhead on each packet/block/segment/message: even if
we can erasure-code large segments at 40 MB/s, the fixed setup time
per segment may be enough to pull small segments well below that.
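
To make that concrete, here's a back-of-the-envelope sketch (the
40 MB/s bulk rate and 2 ms setup cost are made-up illustrative
numbers, not Tahoe measurements) of how a fixed per-segment cost
drags down effective throughput for small segments:

    # Illustrative numbers only: assume bulk encoding runs at 40 MB/s
    # and every segment pays a fixed 2 ms of setup work.
    PER_BYTE_RATE = 40e6   # bytes/second of bulk encoding (assumed)
    SETUP_COST = 0.002     # seconds of fixed work per segment (assumed)

    def effective_throughput(segment_size):
        # total time = fixed setup + time to push the bytes through
        elapsed = SETUP_COST + segment_size / PER_BYTE_RATE
        return segment_size / elapsed   # bytes/second actually achieved

    for size in (4 * 1024, 128 * 1024, 1024 * 1024, 16 * 1024 * 1024):
        mbps = effective_throughput(size) / 1e6
        print("%8d KiB segment: %6.1f MB/s effective"
              % (size // 1024, mbps))

With those numbers, a 4 KiB segment encodes at about 2 MB/s effective
while a 16 MiB segment gets close to the full 40 MB/s: same per-byte
rate, very different results.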

SSH can move files across the wire faster than a pair of Python
programs parsing every packet (say that five times fast! :). Reducing the
amount of code involved on each end, or compiling that code into a
faster language, will speed that part up.
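
For a feel of where the pure-Python cycles go, here's a rough sketch
(not Foolscap's actual buffering code) comparing a transmit buffer
built with repeated string concatenation against a single join:

    import timeit

    chunks = [b"x" * 4096] * 1000      # ~4 MB of message data

    def concat_copies():
        # naive transmit buffer: each += may recopy everything
        # accumulated so far
        buf = b""
        for c in chunks:
            buf += c
        return buf

    def single_join():
        # one pass, one allocation
        return b"".join(chunks)

    print("concat: %.3fs" % timeit.timeit(concat_copies, number=10))
    print("join:   %.3fs" % timeit.timeit(single_join, number=10))

The point isn't this particular pattern: it's that byte-shuffling a C
transport would do once can cost a pure-Python transport many extra
copies, on every message.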

cheers,
 -Brian
