#816 new enhancement

Add ping-all-servers button to welcome page

Reported by: zooko
Owned by:
Priority: minor
Milestone: eventually
Component: code-network
Version: 1.5.0
Keywords: usability transparency ostrom statistics notifyOnDisconnect
Cc:
Launchpad Bug:

Description (last modified by daira)

#653 was a long, drawn-out investigation that concluded that there is probably (but not certainly) a bug in foolscap in which notifyOnDisconnect() sometimes doesn't get triggered when it is supposed to. Fixing (and writing automated tests for) notifyOnDisconnect() is quite tricky. Also, it can never be 100% correct, because communications are inherently unreliable and limited by the speed of light. My personal prejudice as someone who has long studied secure and fault-tolerant networked applications is that you should really avoid relying on such a service -- a service that attempts to tell you when a remote object has switched from "likely to respond in a timely way to your next request" to "unlikely to respond in a timely way to your next request" -- and instead design your system so that it works correctly and as efficiently as it can regardless of the pattern of connections-and-disconnections of the underlying comms subsystems. (Hm, I guess this is an instance of the general idiom of "Don't check if it is likely to work and then try and then handle failure; instead just try and then handle failure.")

Now, Tahoe-LAFS already does it this way! For the most part. There are a few places where we invoke notifyOnDisconnect(), but removing most of them would not diminish the functionality of Tahoe-LAFS. The removals that would diminish its functionality are the ones Brian described on #653:

"""

  • the welcome-page status display would be unable to show "Connected / Not Connected" status for each known server. Instead, it could say "Last Connection Established At / Not Connected". Basically we'd know when the connection was established, and (with extra code) we could know when we last successfully used the connection. And when we tried to use the connection and found it down, we could mark the connection as down until we'd re-established it. But we wouldn't notice the actual event of connection loss (or the resulting period of not-being-connected) until we actually tried to use it. So we couldn't claim to be "connected", we could merely claim that we *had* connected at some point, and that we haven't noticed becoming disconnected yet (but aren't trying very hard to notice).
  • the share-allocation algorithm wouldn't learn about disconnected servers until it tried to send a message to them (this would fail quickly, but still not synchronously), but allocates share numbers ahead of time for each batch of requests. This could wind up with shares placed 0,1,3,4,2 instead of 0,1,2,3,4

The first problem would be annoying, so I think we're going to leave tahoe alone for now. I'll add a note to the foolscap docs to warn users about the notifyOnDisconnect bug, and encourage people not to rely upon it in replacement-connection-likely environments.

"""

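To make the first bullet in Brian's comment concrete, here is a minimal sketch of the bookkeeping it describes, written without any disconnect notification. The class and method names (ConnectionStatus, connected, used_ok, use_failed) are hypothetical and are not taken from Tahoe-LAFS or foolscap; the timestamp is shown as a raw number for simplicity.

```python
import time


class ConnectionStatus:
    """Hypothetical per-server record; not actual Tahoe-LAFS code."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self.established_at = None  # when the connection was last established
        self.last_used_at = None    # when we last used it successfully
        self.known_down = True      # True until we have connected

    def connected(self):
        # Called when a connection (or replacement connection) is established.
        self.established_at = self._clock()
        self.known_down = False

    def used_ok(self):
        # Called after a request on this connection succeeds.
        self.last_used_at = self._clock()

    def use_failed(self):
        # Called when we tried to use the connection and found it down.
        # This is the only way we learn about a loss in this model.
        self.known_down = True

    def describe(self):
        if self.known_down or self.established_at is None:
            return "Not Connected"
        # We cannot claim "Connected", only that we *had* connected.
        return "Last Connection Established At %s" % (self.established_at,)
```

Note that describe() keeps reporting the last establishment time for a dead connection until some request actually fails, which is exactly the limitation the bullet points out.
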
Since he wrote that, I realized that it would be cool if the welcome-page had a "ping all servers" button which then changed their statuses to indicate whether they responded to the ping or not (and how long it took). This would, in my opinion, be more reliable and more informative than the current "connected/not-connected" welcome-page.
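
A rough sketch of what the button's handler could do, assuming each server exposes some cheap request that can be awaited. This uses asyncio as a stand-in rather than Twisted/foolscap, and ping_one, ping_all, and the shape of the servers mapping are all hypothetical names for illustration, not existing Tahoe-LAFS APIs.

```python
import asyncio
import time


async def ping_one(name, ping, timeout):
    """Ping one server; return (name, responded, elapsed seconds or None)."""
    start = time.monotonic()
    try:
        await asyncio.wait_for(ping(), timeout)
    except Exception:
        # Timeout or any failure of the request: just try, then handle failure.
        return name, False, None
    return name, True, time.monotonic() - start


async def ping_all(servers, timeout=5.0):
    """Ping every server concurrently.

    servers: dict mapping server name -> zero-argument coroutine function
    (an assumption of this sketch). Returns a dict mapping
    name -> (responded, elapsed seconds or None).
    """
    results = await asyncio.gather(
        *(ping_one(name, ping, timeout) for name, ping in servers.items())
    )
    return {name: (ok, elapsed) for name, ok, elapsed in results}
```

The welcome page would then render each entry as "responded in N seconds" or "no response", instead of relying on notifyOnDisconnect() to keep a "connected" flag current.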

To close this ticket, make sure you have Brian's approval first, then add a "ping all servers" feature to the welcome page, then remove all uses of notifyOnDisconnect() from Tahoe-LAFS.

Change History (9)

comment:1 Changed at 2009-10-21T22:36:31Z by warner

I like the ping-all-servers button on the welcome page. I'm happy with not using notifyOnDisconnect() to update the welcome-page information in a timely manner (if people want timely information, they can push the button and reload). It might be nice for the welcome page to show "waiting for ping response..." while a ping is in flight, but on the other hand that might also be uglier and unnecessarily complicated.

I haven't yet decided about removing the notifyOnDisconnect() calls which provide share-allocation with more-timely information about how to allocate share numbers. I think I want peer-selection to have reasonably timely information about which servers are likely to be available and which ones are not, to improve the chances that the shares get allocated in order. (In addition to providing useful forensic data later, it also marginally improves download performance, because the downloader is more likely to get "primary" shares sooner.)
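
The ordering effect can be illustrated with a toy model. This is a deliberately simplified sketch, not Tahoe-LAFS's actual peer-selection code: the allocate function, its parameters, and the server names are all made up for illustration.

```python
def allocate(servers, num_shares, is_up, known_down=frozenset()):
    """Assign share numbers to servers, in the order the uploader tries them.

    servers: server names in (permuted) order.
    is_up: dict name -> bool; ground truth the uploader learns only by trying.
    known_down: servers the uploader already believes are disconnected.
    Returns a list of (server, share number) pairs in server order.
    """
    candidates = [s for s in servers if s not in known_down]
    # Share numbers are assigned ahead of time for the first batch of requests.
    batch, spare = candidates[:num_shares], candidates[num_shares:]
    placed = {}
    homeless = []
    for sharenum, server in enumerate(batch):
        if is_up[server]:
            placed[server] = sharenum
        else:
            homeless.append(sharenum)  # learned only after the request failed
    # Share numbers whose request failed go to the next servers in line.
    for server in spare:
        if not homeless:
            break
        if is_up[server]:
            placed[server] = homeless.pop(0)
    return [(s, placed[s]) for s in servers if s in placed]


servers = ["A", "B", "C", "D", "E", "F"]
is_up = {"A": True, "B": True, "C": False, "D": True, "E": True, "F": True}

# Without timely disconnect information, share 2 ends up last in line.
late = [num for _, num in allocate(servers, 5, is_up)]
# With it, server C is skipped up front and the shares stay in order.
timely = [num for _, num in allocate(servers, 5, is_up, known_down={"C"})]
```

Here late comes out as 0,1,3,4,2 and timely as 0,1,2,3,4, matching the example in the description.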

comment:2 Changed at 2009-11-03T03:42:10Z by davidsarah

  • Summary changed from don't rely on notifyOnDisconnect() to Add ping-all-servers button to welcome page

comment:3 Changed at 2010-01-16T01:24:29Z by davidsarah

  • Keywords usability added

comment:4 Changed at 2010-02-23T16:29:28Z by zooko

If you like this ticket you might also like #311 (add "last-heard-from" timestamp to welcome page).

comment:5 Changed at 2011-04-27T16:49:45Z by zooko

  • Keywords transparency ostrom statistics added

comment:6 follow-up: Changed at 2013-05-19T14:05:44Z by daira

  • Description modified

Removing uses of notifyOnDisconnect() in share allocation (if we want to do that) should be a separate ticket.

Last edited at 2013-05-19T14:06:22Z by daira

comment:7 Changed at 2013-05-19T14:07:30Z by daira

  • Milestone changed from undecided to eventually

comment:8 in reply to: ↑ 6 Changed at 2013-05-22T23:40:28Z by zooko

Replying to daira:

Removing uses of notifyOnDisconnect() in share allocation (if we want to do that) should be a separate ticket.

Good point. Opened #1975.

comment:9 Changed at 2013-06-25T16:17:32Z by zooko

  • Keywords notifyOnDisconnect added