[tahoe-dev] [tahoe-lafs] #932: benchmark Tahoe-LAFS compared to nosql dbs

tahoe-lafs trac at allmydata.org
Fri Jan 29 17:43:33 PST 2010


#932: benchmark Tahoe-LAFS compared to nosql dbs
-------------------------------------------+--------------------------------
 Reporter:  zooko                          |           Owner:  somebody 
     Type:  enhancement                    |          Status:  new      
 Priority:  major                          |       Milestone:  undecided
Component:  dev-infrastructure             |         Version:  1.5.0    
 Keywords:  scalability performance large  |   Launchpad_bug:           
-------------------------------------------+--------------------------------
 I'm curious how Tahoe-LAFS performs compared to nosql databases on the
 nosqlish loads that those users care about. Aaron Cordova did some
 benchmarks of Tahoe-LAFS vs. HDFS as the storage backend for Hadoop and
 reported in his HadoopWorld presentation that they performed about the
 same for the map-reduce computation (which is a read-intensive workload):
 http://www.slideshare.net/cloudera/hw09-map-reduce-over-tahoe-a-least-authority-encrypted-distributed-filesystem

 Recently a scientist from Yahoo posted about his benchmarks of various
 nosql systems:

 http://mail-archives.apache.org/mod_mbox/incubator-cassandra-user/201001.mbox/%3cC2D6929236FAC846B7A4FE1EC39910C64F27B52F25@SP1-EX07VS01.ds.corp.yahoo.com%3e

 He says that his benchmarking code will be open-sourced soon pending
 approval from Yahoo's legal department. Maybe we could contribute patches
 that make Tahoe-LAFS one of the systems that his benchmark system can
 measure.

 N.B. Not to get anyone's hopes up: I would expect Tahoe-LAFS to perform
 very badly on those workloads! They typically assign values to
 user-specified keys, which we don't have a native implementation of and
 which we would have to simulate somehow, such as by letting the
 user-chosen keys be the child names in a mutable directory. So I would
 expect Tahoe-LAFS to be pretty much off the charts for bad performance on
 those workloads. But I might be pleasantly surprised. And also: "What gets
 measured gets improved!" :-)
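
 The simulation described above could be sketched roughly as follows. This
 is only an illustration, not an existing Tahoe-LAFS feature or part of any
 benchmark: it assumes a local gateway at http://127.0.0.1:3456 serving the
 web API's /uri/$DIRCAP/CHILDNAME form, and the names child_url, kv_put and
 kv_get are hypothetical. Keys are percent-encoded so that a "/" in a key
 is not read as a subdirectory separator; whether the gateway treats every
 encoded key as intended is likewise an assumption.

```python
import urllib.request
from urllib.parse import quote

# Assumed address of a local Tahoe-LAFS gateway (hypothetical default here).
GATEWAY = "http://127.0.0.1:3456"


def child_url(dircap, key, gateway=GATEWAY):
    """Build the web-API URL for child `key` of the directory `dircap`.

    Both parts are fully percent-encoded so that ':' in the cap and '/'
    in the key cannot change the shape of the URL path.
    """
    return "%s/uri/%s/%s" % (
        gateway.rstrip("/"),
        quote(dircap, safe=""),
        quote(key, safe=""),
    )


def kv_put(dircap, key, value, gateway=GATEWAY):
    """Store bytes `value` under `key` by uploading it as a child file."""
    req = urllib.request.Request(
        child_url(dircap, key, gateway), data=value, method="PUT"
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()  # response body from the gateway


def kv_get(dircap, key, gateway=GATEWAY):
    """Fetch the bytes stored under `key`."""
    with urllib.request.urlopen(child_url(dircap, key, gateway)) as resp:
        return resp.read()
```

 Each kv_put in this sketch modifies the mutable directory as well as
 uploading the value, which is one reason to expect write-heavy key-value
 workloads to be slow when mapped onto Tahoe-LAFS this way.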

-- 
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/932>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid

