#1367 new defect

tolerance for broken TCP connections due to incorrect/restrictive firewalls

Reported by: gdt
Owned by:
Priority: major
Milestone: undecided
Component: code-network
Version: 1.8.2
Keywords: availability firewall reliability
Cc:
Launchpad Bug:

Description

I've run a server and seen problems caused by an overzealous firewall that impairs TCP connections after a short time. When clients try to talk to the server, I see queued bytes that are never ACKed, and each access then seems to take 4 or 8 minutes to time out and finish.

Somehow, Tahoe should avoid repeatedly waiting a long time for servers that history predicts will not answer. Operations that can be completed reasonably quickly with the subset of responding servers should finish reasonably quickly.
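
One possible shape for the "history predicts will not answer" idea is sketched below. This is an illustration only, not existing Tahoe code: the class name ServerHistory, its methods, and the thresholds (2 consecutive timeouts, a 10-minute cooldown) are all hypothetical. It records timeouts per server, pushes suspect servers to the back of the candidate list, and gives them another chance once the cooldown has passed.

```python
import time


class ServerHistory:
    """Hypothetical helper: remembers recent timeouts per server."""

    def __init__(self, max_failures=2, cooldown=600.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive timeouts before a server is suspect
        self.cooldown = cooldown          # seconds before a suspect server is retried
        self.clock = clock                # injectable so the logic can be tested
        self._failures = {}               # server_id -> (consecutive timeouts, time of last one)

    def record_success(self, server_id):
        # Any successful response clears the server's failure history.
        self._failures.pop(server_id, None)

    def record_timeout(self, server_id):
        count, _ = self._failures.get(server_id, (0, 0.0))
        self._failures[server_id] = (count + 1, self.clock())

    def is_suspect(self, server_id):
        count, last = self._failures.get(server_id, (0, 0.0))
        if count < self.max_failures:
            return False
        # Once the cooldown has elapsed, the server is tried again.
        return self.clock() - last < self.cooldown

    def order_servers(self, server_ids):
        # Stable sort: responsive servers first, suspect servers last.
        return sorted(server_ids, key=self.is_suspect)
```

With something like this, an operation that only needs a subset of servers could ask the servers at the front of order_servers() first and reach a suspect server only if the responsive ones are not enough.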

To reproduce without my firewall, add debug code to the server that discards (instead of processing) data on all TCP connections older than 3 minutes. Then bring up a storage node with this impairment on a grid.
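
A minimal sketch of such debug code, assuming a per-connection wrapper around whatever callable normally processes incoming bytes. The name BlackholeAfter and its interface are hypothetical; in a real storage node the hook would presumably sit in the connection's data-received path, which is not shown here.

```python
import time


class BlackholeAfter:
    """Hypothetical debug wrapper: drops incoming data on old connections."""

    def __init__(self, handler, max_age=180.0, clock=time.monotonic):
        self.handler = handler        # callable that normally processes the bytes
        self.max_age = max_age        # 3 minutes, as in the description above
        self.clock = clock            # injectable so the cutoff can be tested
        self.opened_at = clock()      # create one wrapper per new connection
        self.dropped_bytes = 0        # counter to confirm the impairment is active

    def data_received(self, data):
        if self.clock() - self.opened_at > self.max_age:
            # Connection is older than the cutoff: discard instead of processing.
            self.dropped_bytes += len(data)
            return
        self.handler(data)
```

Making max_age configurable lets the impairment be triggered after a few seconds while testing, instead of waiting the full 3 minutes.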

Change History (1)

comment:1 Changed at 2011-02-23T02:40:20Z by davidsarah

  • Keywords reliability added