[tahoe-dev] Tahoe as glue filesystem
Valentino Volonghi
dialtone at gmail.com
Mon Jul 7 12:23:13 PDT 2008
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Jul 7, 2008, at 11:24 AM, Brian Warner wrote:
>> Man, it was great to hear you ask this question. The possibility of
>> this for use in small, distributed, diskless devices is what most
>> recently prompted me to take another look at tahoe. Of course I'm
>> staying for the party anyway, but I am *deeply* interested in a
>> memory-only option for tahoe.
>
> Hm, I suppose it depends upon how much storage you have to work with.
> RAM or flash? Swapping out the allmydata.storage module for something
> that keeps all shares in a dictionary instead of on disk would be
> pretty easy. We actually have some test cases that do just that (to
> test upload/download code in isolation from the storage code). But I
> think it would be hard to achieve good reliability if your share
> integrity depends upon continuous power, especially if the nodes are
> small (and likely to be portable or battery-powered).
Well, the basic idea is to avoid going to disk when data is temporary
or short-lived. And reusing the fail-over properties of tahoe would mean
that users of tahoe don't have to care about how the data is stored.
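
To make the dictionary idea concrete, here is a minimal sketch of a
share store that lives entirely in RAM. It is only an illustration: the
class name, the method names and the (storage_index, sharenum) key are
assumptions of mine, not the actual allmydata.storage interface.

```python
# Hypothetical sketch of a RAM-only share store.
# None of these names come from allmydata.storage; they are illustrative.

class InMemoryShareStore:
    def __init__(self):
        # Maps (storage_index, sharenum) -> share bytes.
        self._shares = {}

    def put_share(self, storage_index, sharenum, data):
        # Copy into an immutable bytes object so callers cannot mutate it later.
        self._shares[(storage_index, sharenum)] = bytes(data)

    def get_share(self, storage_index, sharenum):
        # Raises KeyError if the share was never stored (or was lost on restart).
        return self._shares[(storage_index, sharenum)]

    def list_shares(self, storage_index):
        # Share numbers held for one storage index, in ascending order.
        return sorted(n for (si, n) in self._shares if si == storage_index)

    def used_bytes(self):
        # Total RAM taken by share payloads (ignores dict overhead).
        return sum(len(d) for d in self._shares.values())
```

Everything in such a store is gone when the process exits, which is
acceptable only for the temporary data described above.
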
Basically, since my use case is map-reduce, I'm pretty happy with
uploading tons of logfiles to tahoe on a cluster of servers, since it's
mostly a write-once, read-many kind of thing.
Then all the computation comes into play, and there the most important
thing is being able to talk to the other nodes as fast as possible
without targeting any node specifically (so it's pull rather than push).
That means I'd put the results of a mapper into a known directory in
tahoe, do the same for the results of a reducer, and so on.
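
The pull model can be sketched roughly as follows. A plain dict stands
in for the tahoe directory here, and ResultBoard, publish and pull_new
are hypothetical names of mine, not tahoe calls; a real version would
list and read the directory through a tahoe node instead.

```python
# Hypothetical sketch of pull-based result exchange through a known directory.
# A dict stands in for the tahoe directory; no real tahoe API is used.

class ResultBoard:
    def __init__(self):
        # Maps directory name -> {child name -> result bytes}.
        self._dirs = {}

    def publish(self, directory, name, data):
        # A mapper or reducer drops its output under a well-known directory.
        self._dirs.setdefault(directory, {})[name] = data

    def pending(self, directory, seen):
        # Child names in the directory that the caller has not consumed yet.
        return sorted(set(self._dirs.get(directory, {})) - set(seen))

    def fetch(self, directory, name):
        return self._dirs[directory][name]


def pull_new(board, directory, seen):
    """Fetch every result not seen so far and record it in `seen`."""
    fetched = {}
    for name in board.pending(directory, seen):
        fetched[name] = board.fetch(directory, name)
        seen.add(name)
    return fetched
```

The producer never needs to know who will read its output; consumers
just call pull_new on the directory they care about whenever they are
ready for more work.
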
All these intermediate results are of relatively little importance, and
if the system goes down completely it's no big deal because I can start
again from scratch. But it's really useful if the system resists partial
failures during normal work (as it does thanks to tahoe's other
properties).
In a way tahoe is really much like GFS (the Google File System). It's
not uncommon to have processing servers with 16GB+ of RAM, and keeping
shares in memory would boost performance quite a lot IMHO.
- --
Valentino Volonghi aka Dialtone
Now running MacOS X 10.5
Home Page: http://www.twisted.it
http://www.adroll.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Darwin)
iEYEARECAAYFAkhybSEACgkQ9Llz28widGXkhgCdHkXnxWZiROdaOKZQVwYRdKHa
ytIAoMV11VbmS//h573Ej72v75IcbXyP
=eEJe
-----END PGP SIGNATURE-----