Opened at 2009-05-31T15:42:35Z

Last modified at 2017-09-19T17:20:49Z

#719 new defect

Making requests too soon after startup can fail

Reported by: bewst
Owned by:
Priority: major
Milestone: soon
Component: code-frontend
Version: 1.4.1
Keywords: download upload check repair usability error wui availability reliability
Cc:
Launchpad Bug:

Description (last modified by daira)

$ tahoe start
STARTING /export/home/dave/.tahoe
client node probably started
$ tahoe ls
Error during GET: 410 Gone UnrecoverableFileError: the directory (or mutable file) could not be retrieved, because there were insufficient good shares. This might indicate that no servers were connected, insufficient servers were connected, the URI was corrupt, or that shares have been lost due to server departure, hard drive failure, or disk corruption. You should perform a filecheck on this object to learn more.
$ tahoe ls
Welcome_to_Allmydata.pdf
_My Shared Files_
_Recycle bin_
bak
c++std2003.pdf
$

Change History (10)

comment:1 Changed at 2009-05-31T21:11:52Z by warner

Component changed from unknown to code-network
Owner nobody deleted

This is an issue with hidden depths: how should the client node know that it has connected to every server that it's ever going to need?

But it should be easy to improve the situation somewhat. To start with, there should be some internal function that keeps track of "progress towards full connection":

Have we connected to the introducer? How long ago did we connect? Do we even have an introducer.furl?
How many storage servers have we been told about? How many are connected? How many are left? How long have we been trying to connect to them?

Then, when a directory retrieve or a file download fails due to insufficient shares, this function could provide additional human-useful data, such as: "We couldn't retrieve that directory right now, but since it looks like we've only been connected to the introducer for two seconds, maybe we just don't know about enough servers yet. You should try again in ten seconds."
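As a rough illustration of the "progress towards full connection" idea, here is a minimal sketch. The class name ConnectionProgress and all of its methods and fields are hypothetical and are not part of the Tahoe-LAFS codebase. It only records facts and reports them as text; it makes no attempt to decide whether a failure is transient.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ConnectionProgress:
    """Hypothetical tracker of connection progress (not a Tahoe-LAFS API)."""
    has_introducer_furl: bool = False
    introducer_connected_at: Optional[float] = None
    # server id -> time we connected, or None while still trying
    servers: Dict[str, Optional[float]] = field(default_factory=dict)

    def introducer_connected(self, now: float) -> None:
        self.introducer_connected_at = now

    def server_announced(self, server_id: str) -> None:
        # Keep any existing connection time if we already know this server.
        self.servers.setdefault(server_id, None)

    def server_connected(self, server_id: str, now: float) -> None:
        self.servers[server_id] = now

    def advice(self, now: float) -> str:
        """Return advisory text for a human; no heuristics, only facts."""
        if not self.has_introducer_furl:
            return "No introducer.furl is configured."
        if self.introducer_connected_at is None:
            return "Not yet connected to the introducer."
        age = now - self.introducer_connected_at
        known = len(self.servers)
        connected = sum(1 for t in self.servers.values() if t is not None)
        return (
            "Connected to the introducer %.0f seconds ago; "
            "%d of %d known storage servers are connected (%d pending)."
            % (age, connected, known, known - connected)
        )


# Example with explicit timestamps so the result is deterministic.
progress = ConnectionProgress(has_introducer_furl=True)
progress.introducer_connected(now=100.0)
progress.server_announced("server-a")
progress.server_announced("server-b")
progress.server_connected("server-a", now=101.0)
message = progress.advice(now=102.0)
```

In a real node the timestamps would presumably come from a clock such as time.monotonic(); they are passed in explicitly here only to keep the sketch easy to test.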

I'm not sure how to deliver that extra information. Specifically, the tahoe node should not try to guess whether this is a transient failure or a permanent one: we don't want to resort to heuristics or fixed timeouts. So this extra data is advisory and should be interpreted by a human rather than a piece of code.

So from the webapi point of view, 410 still seems like the right response code, but maybe we can add the text to the response body, and make sure that the CLI tools will deliver this body to stderr.
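On the CLI side, delivering the body to stderr might look roughly like the following sketch. The function report_http_error is hypothetical and is not the actual tahoe CLI code. The "Error during GET" prefix follows the transcript in the description, and the advisory second line of the body is an invented example of what the node could append.

```python
import io
import sys


def report_http_error(status, reason, body, stderr=None):
    """Write an HTTP error line plus the response body to stderr.

    Hypothetical helper: returns a nonzero exit code for the CLI to use.
    """
    if stderr is None:
        stderr = sys.stderr
    stderr.write("Error during GET: %d %s\n" % (status, reason))
    text = body.strip()
    if text:
        # The body is passed through unmodified so a human can read
        # whatever advisory text the node included.
        stderr.write(text + "\n")
    return 1


# Example: capture the output in a buffer instead of the real stderr.
demo_stderr = io.StringIO()
exit_code = report_http_error(
    410,
    "Gone",
    "UnrecoverableFileError: insufficient good shares.\n"
    "Connected to the introducer 2 seconds ago.\n",
    stderr=demo_stderr,
)
```

The response code stays 410 in this sketch; only the body carries the extra text, so scripts that check the status see no change in behavior.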

We have similar issues in a browser. I don't know when browsers will show the response body for things like 410 GONE, but maybe we can use the same technique.

comment:2 Changed at 2010-04-04T17:10:48Z by davidsarah

Keywords usability error wui added
Milestone changed from undecided to 1.7.0

This issue also affects the WUI. Some browsers (in particular IE) hide response bodies for HTTP errors by default, but the response body is still the right place to put human-readable information about the error; the HTTP spec specifically says that browsers SHOULD display the entity body for errors (see the end of RFC 2616 section 6.1.1).

Last edited at 2012-04-23T18:42:12Z by davidsarah