#235 new enhancement

scale up to many nodes

Reported by: zooko Owned by:
Priority: major Milestone: undecided
Component: code-network Version: 0.7.0
Keywords: scalability memory performance openssl leastauthority Cc:
Launchpad Bug:

Description (last modified by exarkun)

I updated The UseCases Page to reflect that someone might want to run a managed Tahoe grid comprising one thousand nodes. (If each node has a single 1 TB hard drive, that's a 1 PB grid. Obviously there are lots of other options, such as each node having six 1 TB hard drives in a RAID-6 configuration, resulting in 4 usable TB per node or a 4 PB grid.)

Anyway, we expect that the current Tahoe grid would have problems handling more simultaneously connected nodes. One known problem is that pyOpenSSL uses almost 1 MB of RAM per SSL connection. (See also #11.)

This ticket can be closed when Tahoe is demonstrated to handle one thousand simultaneously connected nodes smoothly.

Change History (6)

comment:1 Changed at 2008-06-01T21:04:59Z by warner

  • Milestone changed from eventually to undecided

comment:2 Changed at 2009-12-13T03:02:47Z by davidsarah

  • Keywords memory performance openssl added; scaling removed

comment:3 Changed at 2010-01-03T02:56:00Z by davidsarah

From ticket:872#comment:16 :

I don't yet know how to scale tahoe up to millions of nodes, but I think it will be important to have a well-defined and stable place to find your shares (even if you have to do O(log(N)) queries to find them).

The nodes don't need to maintain connections to all other nodes, so that's not the scaling constraint. The obvious scaling constraint is the size of the location info for other nodes. When you get to the point where that information takes an unreasonable amount of memory, you can split the network into subgrids, with each subgrid having a set of supernodes (with should be the most available nodes in each subgrid). Then each node only needs to know the locations of all the supernodes, and each supernode only needs to know the locations of other nodes within its subgrid. This creates a small-world network in which any node is at most two hops from any other. So, you can scale to roughly the square of the number of nodes that would otherwise be feasible: use the permuted list algorithm to pick the supernodes for a given file, and have them use the algorithm again to route to the actual storage servers.

Note that this shouldn't be needed in order to scale to 1000 nodes -- the size of the location and public key info for 1000 nodes should easily be small enough to fit into memory. Do we need another ticket for scaling to grids with hundreds of thousands of nodes, or am I being too prematurely ambitious? :-)

comment:4 Changed at 2010-01-06T07:36:57Z by warner

There are a number of hurdles to scale up to lots of nodes. This ticket is sort of a reminder to enumerate some of them, or record known limitations and potential solutions.

The limitation alluded to in the summary is probably the first hurdle: Foolscap, at least, has been observed (as of the creation date of this ticket, some 2 years ago) to consume an unreasonable approx. 1MB of RAM per open connection. I seem to remember doing some analysis and deciding that pyOpenSSL was to blame, rather than foolscap, but that was a long time ago and the tests should be run again before putting too much energy into it. There's no good reason for it to use this much memory.. the connection state and buffers should really fit into a couple of kilobytes.

The next hurdle will be the current practice of maintaining open connections to all known storage servers. If we left the protocols alone, we could change this to open connections on-demand, but that would incur a significant per-file latency (for both upload and download), and of course things like file-check and mutable-file publish would become really really slow because both want to query lots of servers. So changing the peer-selection protocols would probably be necessary to effectively remove this limitation.

comment:5 Changed at 2010-01-16T00:10:07Z by davidsarah

If you like this ticket, you might like #444 (reduce number of active connections: connect on demand).

comment:6 Changed at 2020-01-17T23:54:04Z by exarkun

  • Description modified (diff)
  • Keywords leastauthority added
Note: See TracTickets for help on using tickets.