[tahoe-dev] Tahoe as glue filesystem

Valentino Volonghi dialtone at gmail.com
Mon Jul 7 12:23:13 PDT 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Jul 7, 2008, at 11:24 AM, Brian Warner wrote:

>> Man, it was great to hear you ask this question. The possibility of
>> this for use in small, distributed, diskless devices is what most
>> recently prompted me to take another look at tahoe. Of course I'm
>> staying for the party anyway, but I am *deeply* interested in a
>> memory-only option for tahoe.
>
> Hm, I suppose it depends upon how much storage you have to work with. RAM
> or flash? Swapping out the allmydata.storage module for something that
> keeps all shares in a dictionary instead of on disk would be pretty easy;
> we actually have some test cases that do just that (to test
> upload/download code in isolation from the storage code). But I think it
> would be hard to achieve good reliability if your share integrity depends
> upon continuous power, especially if the nodes are small (and likely to
> be portable or battery-powered).

Well, the basic idea is to avoid going to disk when data is temporary or
short-lived. And reusing the fail-over properties of tahoe would mean that
users of tahoe don't have to care about how the data is stored.
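
To make the "shares in a dictionary" idea concrete, here is a minimal
sketch of what a memory-only share store might look like. The class name
and its methods are made up for illustration; this is not the
allmydata.storage interface, just an assumption about the smallest useful
shape such a replacement could have.

```python
class InMemoryShareStore:
    """Hypothetical memory-only share store (NOT the allmydata.storage API).

    Shares live in a plain dict, so everything is lost when the process
    or the machine goes down.
    """

    def __init__(self):
        # Maps (storage_index, share_number) -> share bytes.
        self._shares = {}

    def put_share(self, storage_index, share_number, data):
        # Copy into immutable bytes so later changes by the caller
        # cannot alter the stored share.
        self._shares[(storage_index, share_number)] = bytes(data)

    def get_share(self, storage_index, share_number):
        # Raises KeyError if the share was never stored or was lost.
        return self._shares[(storage_index, share_number)]

    def share_numbers(self, storage_index):
        # Sorted share numbers held for one storage index.
        return sorted(n for (si, n) in self._shares if si == storage_index)

    def remove_all(self, storage_index):
        # Drop every share of a file, e.g. once a job's temporary
        # results are no longer needed.
        for key in [k for k in self._shares if k[0] == storage_index]:
            del self._shares[key]
```

Since the data is short-lived anyway, losing the whole dict on a restart
would be acceptable here; the erasure coding across nodes is what would
cover a single node going away.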

Basically, since my use case is a mapreduce, I'm pretty happy with
uploading tons of logfiles to tahoe on a cluster of servers, since it's
mostly a write-once-read-all-the-other-times kind of thing.

Then all the computation comes into play, and there the biggest thing is
being able to talk to the other nodes in the fastest way possible but
without targeting any node specifically (so it's pull rather than push).
This means that basically I'd put the results of a mapper into a known
directory in tahoe, and the same for the results of a reducer, and so on.
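
As a rough sketch of the pull model described above: mappers drop their
results into a well-known directory, and a reducer picks up whatever it
finds there without knowing which node produced it. SharedDirectory is a
hypothetical in-process stand-in for a tahoe directory, and the function
names and the "count by last field of a log line" job are assumptions
made up for the example.

```python
from collections import Counter


class SharedDirectory:
    """Hypothetical stand-in for a well-known directory in tahoe."""

    def __init__(self):
        self._files = {}

    def put(self, name, contents):
        self._files[name] = contents

    def list(self):
        return sorted(self._files)

    def get(self, name):
        return self._files[name]


def run_mapper(mapper_id, log_lines, map_dir):
    # Count log lines by their last whitespace-separated field and
    # publish the partial result under a name derived from the mapper.
    counts = Counter(line.split()[-1] for line in log_lines if line.strip())
    map_dir.put("mapper-%s" % mapper_id, dict(counts))


def run_reducer(map_dir, reduce_dir, seen):
    # Start from the previously published totals, if any.
    totals = Counter()
    if "totals" in reduce_dir.list():
        totals.update(reduce_dir.get("totals"))
    # Pull every mapper result not merged yet; no mapper is addressed
    # directly, the reducer only looks at the known directory.
    for name in map_dir.list():
        if name not in seen:
            totals.update(map_dir.get(name))
            seen.add(name)
    reduce_dir.put("totals", dict(totals))
    return dict(totals)
```

The reducer can be run again at any time: it only merges files it has not
seen before, so mappers can finish in any order and on any node.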

All these intermediate results are of relatively little importance, and if
the system goes completely down it's no big deal because I can start again
from scratch. But it's really useful if the system resists partial
failures during normal work (as it does thanks to tahoe's other
properties).

In a way tahoe is really much like GFS (Google File System). It's not
uncommon to have processing servers with 16GB+ of RAM, and keeping these
short-lived shares in memory would boost performance quite a lot IMHO.

- --
Valentino Volonghi aka Dialtone
Now running MacOS X 10.5
Home Page: http://www.twisted.it
http://www.adroll.com

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Darwin)

iEYEARECAAYFAkhybSEACgkQ9Llz28widGXkhgCdHkXnxWZiROdaOKZQVwYRdKHa
ytIAoMV11VbmS//h573Ej72v75IcbXyP
=eEJe
-----END PGP SIGNATURE-----

