On Mon, Jun 27, 2011 at 1:37 AM, Nathan Eisenberg <nathan@atlasnetworks.us> wrote:
> The grid will never grow to more than 10 nodes, as we'll just create
> additional grids after that (this is primarily to prevent an
> allmydata-type failure where dircaps are spread over many servers). If
> more space is required, we'll expand the 10 nodes, rather than add
> more nodes.
Another solution is to add more nodes but to increase N and K as you do. At the extreme, if you kept N equal to the number of nodes in the grid (and could get every existing file re-encoded to N shares), the allmydata problem couldn't happen, because all files in the system would live or die together. You can't actually do that, but if you keep increasing N and K, then for files added later it would take a very large number of simultaneous failures to make anything unavailable. Allmydata used N=10, K=3, so any time 7 servers (out of hundreds!) were down it was likely to knock some files out. When those files were dircaps, the problem was particularly nasty.
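(For concreteness: those parameters live in each client's tahoe.cfg, under the standard [client] option names; the values below are just illustrative, not a recommendation:)

    [client]
    # K: shares needed to reconstruct a file
    shares.needed = 5
    # servers-of-happiness: distinct servers that must end up holding
    # shares for an upload to be considered successful
    shares.happy = 7
    # N: total shares generated per file
    shares.total = 10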
However, if you just increase N and K, you'll have a problem: the dircaps -- the most important files, from a reliability perspective -- will keep their original encoding parameters. As you update the directories, I believe the shares will shift around the nodes, but they won't gain any more shares... unless you use immutable directories, which are necessarily re-created with each update. Can that be done via the SFTP interface?
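(Via the web API it can at least be done with the t=mkdir-immutable operation. A minimal sketch in Python, assuming a node with its web gateway on the default 127.0.0.1:3456; the child name and read-cap below are placeholders, not real values:)

    import json
    from urllib.request import Request, urlopen

    # Children of the new immutable directory, in the webapi's
    # t=mkdir-with-children format: name -> [type, {"ro_uri": ...}].
    # The cap here is a placeholder.
    children = {
        "example.txt": ["filenode", {"ro_uri": "URI:CHK:placeholder..."}],
    }

    req = Request(
        "http://127.0.0.1:3456/uri?t=mkdir-immutable",
        data=json.dumps(children).encode("utf-8"),
        method="POST",
    )
    # The response body is the cap of the new immutable directory.
    print(urlopen(req).read().decode("utf-8"))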
> Since nodes will never leave the grid permanently (only brief windows
> for reboots and such), I was thinking that simple replication (k=1,
> n=2, happy=2) would be sufficient. The backing disks are in RAID-1, to
> prevent a disk failure from requiring a file repair.
Just my opinion, but I think this approach ignores Tahoe's strengths. Setting the RAID-1 aside for the moment, and supposing 98% availability per server (probably conservative), K=1/N=2 gives you a 0.04% probability of file loss at the cost of a 100% expansion factor. If, instead, you set K=5, N=10 (or perhaps K=4, N=8, to avoid write failures when a couple of machines are down), you get orders of magnitude lower probability of file loss for the same expansion factor. And I'd also consider skipping the RAID-1 and instead running two Tahoe storage servers on each machine, one per disk.
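Here's the back-of-the-envelope arithmetic behind those numbers, assuming each share lands on a distinct server and servers fail independently:

    from math import comb

    def p_file_loss(n, k, p_up):
        """Probability that fewer than k of a file's n shares are
        reachable, with one share per server and independent
        per-server availability p_up."""
        return sum(comb(n, i) * p_up**i * (1 - p_up)**(n - i)
                   for i in range(k))

    for n, k in [(2, 1), (10, 5), (8, 4)]:
        print(f"N={n}, K={k}: expansion {n/k:.1f}x, "
              f"P(loss) = {p_file_loss(n, k, 0.98):.2e}")

That prints 4.00e-04 (the 0.04% above) for K=1/N=2, versus roughly 1e-8 for K=5/N=10 and 2e-7 for K=4/N=8, all at the same 2x expansion.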
-- 
Shawn