On Tue, May 24, 2011 at 9:55 AM, Neil Aggarwal <span dir="ltr">&lt;<a href="mailto:neil@jammconsulting.com">neil@jammconsulting.com</a>&gt;</span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
Hello all:<br>
<br>
I have been reading the Tahoe docs and am a bit confused.<br>
<br>
I am looking for a distributed real-time filesystem.<br></blockquote><div><br></div><div>Tahoe is a distributed file system. Whether it counts as "real-time" is debatable. In particular, Tahoe does not coordinate simultaneous writes to the same mutable file (directories are mutable files too), so if that can happen in your setup, Tahoe won't work on its own. You need some mechanism outside Tahoe to serialize updates.</div>
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">Does Tahoe allow me to access it just like a regular<br>
filesystem? For example, do I cd to a directory and<br>
list the files?<br></blockquote><div><br></div><div>There are some FUSE modules that expose Tahoe through a standard file system interface, but their quality is not high and they have some limitations.</div><div> </div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">My network looks like this:<br>
<br>
Colo 1 Colo 2<br>
Server 1 Server 5<br>
Server 2 Server 6<br>
Server 3 Server 7<br>
Server 4 Server 8<br>
<br>
Colo 1 and 2 are separated by a large distance.<br>
The servers in each colo are on the same network.<br>
All servers will be running CentOS.<br>
<br>
Here is what I need:<br>
1. Any server may create or modify a file<br>
and changes should be immediately available<br>
to the others.<br></blockquote><div><br></div><div>As long as there's no chance of two servers trying simultaneously to modify the same file or the same directory, Tahoe will do that.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
2. I need to have no single points of failure.<br></blockquote><div><br></div><div>Tahoe does this part very well. This is one of Tahoe's main goals... not only will there be no single point of failure, but the data will be spread across the servers so that several of them could fail simultaneously without affecting the availability of the data to the others.</div>
<div><br></div><div>Assuming you can work around the simultaneous-update problem, and assuming that you can deal with the FUSE implementation imperfections (or with using a different way to read/write data), and assuming that you're not looking for extreme performance, then Tahoe will work. I would suggest setting your N (the number of shares of each file to create) to 8, the total number of servers you have, and K (the number of shares needed to be able to retrieve a file) to 4 or less. With one share on each server, each data center holds 4 shares, so if K > 4 then losing the connection between the data centers will mean the data is unavailable until the connection is restored. Choosing K below 4 also lets a data center keep reading if one of its own servers is down during the outage.</div>
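<div><br></div><div>To make the K/N reasoning concrete, here is a small Python sketch. It is not Tahoe code: the function names are made up for illustration, and it assumes the idealized case where the 8 shares land one per server, which Tahoe's share placement does not strictly guarantee. It only checks whether a client that can reach a given number of servers can still collect K shares.</div>

```python
def reachable_shares(servers_reachable, shares_per_server=1):
    """Shares a client can fetch, assuming shares are spread evenly
    (one per server by default). This is an idealized placement."""
    return servers_reachable * shares_per_server


def can_read(k, servers_reachable, shares_per_server=1):
    """A file is retrievable when at least k shares are reachable."""
    return reachable_shares(servers_reachable, shares_per_server) >= k


if __name__ == "__main__":
    SERVERS_PER_COLO = 4  # taken from the layout in the question

    for k in range(1, 9):
        # Inter-colo link down: only the 4 local servers are reachable.
        link_down = can_read(k, SERVERS_PER_COLO)
        # Link down and one local server has also failed.
        link_down_one_dead = can_read(k, SERVERS_PER_COLO - 1)
        print(f"K={k}: readable with link down: {link_down}; "
              f"with one local server also down: {link_down_one_dead}")
```

<div>Under these assumptions, any K above 4 makes files unreadable in both data centers while the link is down, and lower values of K leave room for local server failures on top of that.</div>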
<div><br></div></div>-- <br>Shawn<br>