[tahoe-dev] timeouts

Sam Mason sam at samason.me.uk
Mon Aug 10 09:34:07 PDT 2009


On Sun, Aug 09, 2009 at 02:25:34PM -0600, Zooko Wilcox-O'Hearn wrote:
> On Thursday,2009-08-06, at 19:58 , Sam Mason wrote:
> > My only initial concern is the apparent lack of timeouts when  
> > creating/uploading things.
> 
> Nope, this is a known issue.  It happens a lot on Test Grid, where  
> there are nodes which are offering storage service but which  
> disconnect abruptly without saying goodbye or which take ages  
> (minutes) to respond to your requests.  I encounter it frequently  
> because my blog is stored on Test Grid.  It doesn't happen very often on
> grids with higher-quality storage servers.  Here are some
> probably-relevant tickets: #193, #253, #287, #436, #521, #573.

The fixes to those look as though they'd be scattered across the code
somewhat.  Just to dip into the code (so to speak): if I were to fix
only my immediate problem, what would be a good fix?  The other tickets
are mainly about download or the initial selection of servers, so they
seem to be a different, though related, problem.

There seem to be a couple of ways of fixing this.  The easiest is to
tell the user the file has been uploaded once some number of shares
(somewhere between K and N) have been successfully sent to other
servers.  The remaining shares would continue to be sent in the
background, with normal repair mechanisms coming to the rescue if the
failing servers never made it back into the network.
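
To make the first option concrete, here is a rough sketch of what I
mean.  It is not Tahoe code: it uses plain asyncio rather than whatever
the uploader really does, and send_share, threshold and timeout are
names I made up for illustration.  It reports success as soon as
`threshold` shares are stored and hands back the unfinished sends so
they can carry on in the background.

```python
import asyncio


async def upload_with_threshold(send_share, servers, threshold, timeout):
    """Wait until `threshold` shares are stored or `timeout` seconds pass.

    send_share(server) is a hypothetical coroutine that pushes one share
    to one server and raises on failure.  Returns a tuple of
    (number_of_successful_shares, set_of_still_pending_tasks).
    """
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    pending = {asyncio.ensure_future(send_share(s)) for s in servers}
    succeeded = 0

    while pending and succeeded < threshold:
        remaining = deadline - loop.time()
        if remaining <= 0:
            break  # out of time; the caller decides what to do
        done, pending = await asyncio.wait(
            pending, timeout=remaining, return_when=asyncio.FIRST_COMPLETED
        )
        # Count only sends that finished without raising.
        succeeded += sum(
            1 for t in done if not t.cancelled() and t.exception() is None
        )

    # Tasks left in `pending` keep running while the event loop is alive.
    return succeeded, pending
```

The caller would treat `succeeded >= threshold` as "uploaded" and leave
the pending tasks alone; anything that never completes is left to
repair.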

A better fix would seem to be to send the failing shares off to other
servers, but if I have interpreted the protocol correctly, each server
knows which other servers contain shares, so you'd need some way of
telling them that things have moved.
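
The retry half of that second option could look something like the
sketch below.  Again this is only an illustration with invented names
(place_share, send_share, candidates), not anything from the Tahoe
code, and it deliberately leaves out the hard part: telling the other
servers where the share ended up.

```python
import asyncio


async def place_share(send_share, share, candidates, timeout):
    """Try each candidate server in turn until one accepts the share.

    send_share(server, share) is a hypothetical coroutine.  A server
    that raises OSError or takes longer than `timeout` seconds is
    skipped.  Returns the server that took the share, or None.
    """
    for server in candidates:
        try:
            await asyncio.wait_for(send_share(server, share), timeout)
            return server
        except (asyncio.TimeoutError, OSError):
            continue  # slow or unreachable; move on to the next one
    return None
```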

Comments?

-- 
  Sam  http://samason.me.uk/

