#1764 new defect

tahoe webapi gives HTTP 410 Gone for files that may actually come back — at Version 5

Reported by: ChosenOne
Owned by: ChosenOne
Priority: normal
Milestone: soon
Component: code-frontend-web
Version: 1.9.1
Keywords: http standards test-needed
Cc:
Launchpad Bug:

Description (last modified by daira)

From RFC 2616 about HTTP 410 Gone:

The requested resource is no longer available at the server and no forwarding address is known. This condition is expected to be considered permanent.

This response is cacheable unless indicated otherwise.

[...] the resource is intentionally unavailable and that the server owners desire that remote links to that resource be removed.

A few things are wrong about that: if the gateway could not find enough shares because too few servers are currently available, the error is in fact temporary, and links to that resource may become valid again.

Tahoe should instead return a 404, i.e. http.NOT_FOUND instead of http.GONE.
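
A minimal sketch of the proposed change, using Python's standard-library `http.HTTPStatus` rather than the `twisted.web.http` constants named above. `SharesUnavailable` and `status_for_failure` are hypothetical names chosen for illustration; they are not Tahoe's actual classes or functions.

```python
from http import HTTPStatus


class SharesUnavailable(Exception):
    """Hypothetical stand-in for a 'could not find enough shares' failure."""


def status_for_failure(exc: Exception) -> HTTPStatus:
    """Pick an HTTP status for a failed download (illustrative only)."""
    if isinstance(exc, SharesUnavailable):
        # The ticket's point: missing shares may be temporary, so answer
        # 404 Not Found rather than 410 Gone.
        return HTTPStatus.NOT_FOUND
    # Assumption: anything else is treated as an internal error here.
    return HTTPStatus.INTERNAL_SERVER_ERROR
```

With this shape, a client that receives the response is not told to drop its links, which is what 410 would imply.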

Change History (6)

Changed at 2012-06-10T18:52:26Z by ChosenOne



comment:1 Changed at 2012-06-11T04:01:49Z by davidsarah

  • Component changed from unknown to code-frontend-web
  • Description modified (diff)
  • Keywords http standards added
  • Milestone changed from undecided to 1.10.0
  • Summary changed from tahoe webapi gives HTTP 401 Gone for files that may actually come back to tahoe webapi gives HTTP 410 Gone for files that may actually come back

comment:2 Changed at 2012-11-22T01:38:34Z by davidsarah

  • Keywords test-needed added
  • Owner changed from davidsarah to ChosenOne

If that patch doesn't break some tests, the test coverage was incomplete! :-) +1 on changing this, though.

comment:3 Changed at 2013-04-04T16:24:39Z by daira

  • Milestone changed from 1.10.0 to 1.11.0

comment:4 Changed at 2013-04-25T20:30:57Z by leif

Would 504 Gateway Timeout not be more appropriate than 404 Not Found?

From the RFC:

The server, while acting as a gateway or proxy, did not receive a timely response from the upstream server specified by the URI (e.g. HTTP, FTP, LDAP) or some other auxiliary server (e.g. DNS) it needed to access in attempting to complete the request.

comment:5 Changed at 2013-05-30T19:23:58Z by daira

  • Description modified (diff)

kpreid filed a duplicate #1993:

NotEnoughSharesError, NoSharesError, and UnrecoverableFileError, at least, are being reported using HTTP status code 410 Gone, which is a severe misuse of the code, as 410 means that the resource is known to be forevermore unavailable. Per RFC 2616 section 10.4.11:

The requested resource is no longer available at the server and no forwarding address is known. This condition is expected to be considered permanent. Clients with link editing capabilities SHOULD delete references to the Request-URI after user approval. If the server does not know, or has no facility to determine, whether or not the condition is permanent, the status code 404 (Not Found) SHOULD be used instead.

All of these errors indicate that the gateway is currently unable to fulfill the request (as any of them could result from a temporary partition in the grid), not permanent deletion. 410 would be appropriate if, for example, a mutable file were put into a revoked, “no content and cannot be written to further”, state, but not for anything less drastic. (Tahoe is unusual in having even the architectural possibility of having enough confidence to correctly answer 410!)

The most appropriate response codes would be, I think, 404 for NoSharesError (because the grid has no knowledge of the file) and 503 for NotEnoughSharesError (because the grid knows the file exists but cannot be served). UnrecoverableFileError appears to be a conflation of the two in the case of mutable files, and so I see no good answer there but to introduce a distinction between the two cases.

Regardless, 410 should not be used in any of these cases.

I noticed this via https://tahoe-lafs.org/pipermail/tahoe-dev/2013-May/008313.html .
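
The mapping kpreid proposes above could be sketched roughly as follows. The three exception classes are local stand-ins that only borrow the names from the ticket so the example runs on its own; `PROPOSED_STATUS` and `proposed_status` are hypothetical names, and returning `None` for `UnrecoverableFileError` is an assumption that mirrors the comment leaving that case open.

```python
from http import HTTPStatus
from typing import Optional


# Stand-in classes: these local definitions only borrow the names used in
# the ticket; they are not imported from Tahoe-LAFS.
class NoSharesError(Exception):
    pass


class NotEnoughSharesError(Exception):
    pass


class UnrecoverableFileError(Exception):
    pass


# Proposed mapping: 404 when no shares are found at all, 503 when the file
# is known to exist but too few shares are reachable to serve it.
PROPOSED_STATUS = {
    NoSharesError: HTTPStatus.NOT_FOUND,
    NotEnoughSharesError: HTTPStatus.SERVICE_UNAVAILABLE,
}


def proposed_status(exc: Exception) -> Optional[HTTPStatus]:
    """Return the proposed status, or None where the proposal leaves it open."""
    return PROPOSED_STATUS.get(type(exc))
```

Note that 410 Gone appears nowhere in the table, in line with the comment's conclusion.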

warner wrote:

Sounds good to me. Note that NoSharesError could be interpreted as an even-less-healthy version of NotEnoughSharesError, where there are so few shares that we couldn't find even a single one. So there might be an argument for reporting 503 in both cases.

If 410 means "it will never exist", does 404 mean "it might come back someday"? Also, does 410 imply anything about whether or not it used to exist? Are there any normal-web-server situations that would correctly produce a 410?

kpreid answered, in part:

I agree that 503 is not-wrong, but it is commonly understood that 404 can result from servers being temporarily broken; I think it is more valuable to have the property that any bogus URL yields a 404.

I (Daira) agree.

If your grid is so flaky that you can lose all shares of a file, that's another problem entirely. (Actually: what if the gateway is connected to so few storage servers that enough (properly spread) shares could not possibly be found? That would be an appropriate time for a 503 if no shares are found, since it is likely that the answer will be different when the grid is in better condition.)

That would be part of #719.
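
The connectivity-aware variant described in the parenthetical above might look something like this sketch. The function name and both parameters are hypothetical, and it assumes "properly spread" means at most one share per server, so that fewer connected servers than required shares makes a successful download impossible.

```python
from http import HTTPStatus


def status_when_no_shares(connected_servers: int, shares_needed: int) -> HTTPStatus:
    """Status for a download that found no shares at all (illustrative only)."""
    if connected_servers < shares_needed:
        # Too few servers are reachable for the download to succeed at all,
        # so the answer may well change once the grid is healthier: 503.
        return HTTPStatus.SERVICE_UNAVAILABLE
    # Enough servers were asked and none had a share: treat as a bogus URL.
    return HTTPStatus.NOT_FOUND
```

For example, with `shares_needed=3` (an example value only), a gateway connected to two servers would answer 503, while one connected to ten would answer 404.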
