[tahoe-lafs-trac-stream] [Tahoe-LAFS] #2101: improve error messages from failed uploads

Thu Dec 11 23:24:52 UTC 2014

#2101: improve error messages from failed uploads
------------------------------------+-------------------------------------------------------------
     Reporter:  zooko               |      Owner:  daira
         Type:  defect              |     Status:  new
     Priority:  normal              |  Milestone:  1.12.0
    Component:  code-peerselection  |    Version:  1.10.0
   Resolution:                      |   Keywords:  upload error servers-of-happiness transparency
Launchpad Bug:                      |
------------------------------------+-------------------------------------------------------------

Comment (by daira):

 #1941 was a duplicate. Its description was:

 > I heard that the volunteergrid2 project has shut down. The participants,
 in explaining why they gave up on it, said that they often got
 "unhappiness errors" when they tried to upload files, so they never
 trusted the grid with their backups.
 >
 > There are two problems here that this ticket attempts to address:
 >
 > 1. They didn't trust the grid. Why? Not because the upload failed, but
 because they didn't know why the upload had failed. They interpreted this
 as evidence that Tahoe-LAFS was buggy or unreliable. If they had seen a
 clear, understandable explanation that said "This upload failed because
 you specified you required at least 15 servers, and of the 20 servers on
 your grid, 10 of them are currently unreachable.", then they would have
 continued to trust the Tahoe-LAFS software and they would have known what
 changes to make (to their grid or their happiness parameter) to get what
 they wanted. (Note that this information was actually already in those
 "unhappiness errors", but they didn't read or understand it. See below.)
 >
 > 2. We (the tahoe-lafs developers) don't know why their uploads failed.
 Perhaps Tahoe-LAFS was harboring some previously-unknown bug. Perhaps too
 many of their servers were on flaky home DSL that timed out on most requests.
 Perhaps it was something else. We can't improve the software without a
 working feedback loop whereby we can learn the details of failures.
 >
 > This ticket is to make it so that when an upload fails, you can read an
 understandable story of what happened that led to the failure, specifying
 which servers your client tried to use and what each server did.
 >
 > Note that the basic information of how many servers were reachable,
 etc., is encoded into the error message that users currently see, but
 users do not read that error message, because it contains a Python
 traceback, so they just gloss over it. So this ticket is to make two
 changes to that:
 >
 > 1. Add more information. Not just the number of servers that failed, but
 which specific servers (identifiers, nicknames, IP addresses) and when.
 >
 > 2. Make it a human-oriented HTML page, not a Python traceback. Most
 users will not read anything that contains a Python traceback.
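
 As a rough illustration of the two changes listed above, here is a minimal
 sketch of how such a report could be assembled. Everything in it is
 hypothetical: `ServerAttempt`, `explain_upload_failure`, and the field names
 are made up for this example and are not part of the Tahoe-LAFS code. It
 also simplifies servers-of-happiness to "number of servers that accepted a
 share", and it emits plain text rather than the HTML page the description
 asks for.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ServerAttempt:
    """One contact attempt with one storage server (hypothetical record)."""
    server_id: str   # short server identifier
    nickname: str    # human-chosen server nickname
    address: str     # IP address or hostname
    when: str        # time of the attempt, already formatted
    outcome: str     # e.g. "accepted share", "unreachable", "timed out"
    ok: bool         # True if the server ended up holding a share


def explain_upload_failure(happy: int, attempts: List[ServerAttempt]) -> str:
    """Build a plain-text story of a failed upload, with no traceback."""
    total = len(attempts)
    succeeded = sum(1 for a in attempts if a.ok)
    failed = total - succeeded
    lines = [
        f"This upload failed because you required at least {happy} servers, "
        f"and of the {total} servers on your grid, only {succeeded} accepted "
        f"a share ({failed} did not).",
        "",
        "What each server did:",
    ]
    for a in attempts:
        status = "OK  " if a.ok else "FAIL"
        lines.append(
            f"  [{status}] {a.when}  {a.nickname} "
            f"({a.server_id}, {a.address}): {a.outcome}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up servers using documentation-only IP addresses.
    demo = [
        ServerAttempt("srv-1", "alpha", "192.0.2.1", "23:24:50", "accepted share", True),
        ServerAttempt("srv-2", "beta", "192.0.2.2", "23:24:51", "unreachable", False),
        ServerAttempt("srv-3", "gamma", "192.0.2.3", "23:24:52", "timed out", False),
    ]
    print(explain_upload_failure(2, demo))
```

 The point of the sketch is only the shape of the output: one plain sentence
 stating the requirement and the shortfall, followed by one line per server
 giving its identity, the time, and what it did.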

--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2101#comment:7>
Tahoe-LAFS <https://Tahoe-LAFS.org>
secure decentralized storage