[tahoe-dev] [tahoe-lafs] #653: introducer client: connection count is wrong, !VersionedRemoteReference needs EQ
tahoe-lafs
trac at allmydata.org
Fri Jul 17 07:11:49 PDT 2009
#653: introducer client: connection count is wrong, !VersionedRemoteReference
needs EQ
--------------------------+-------------------------------------------------
 Reporter:  warner        |           Owner:  warner
     Type:  defect        |          Status:  assigned
 Priority:  major         |       Milestone:  1.5.0
Component:  code-network  |         Version:  1.3.0
 Keywords:                |   Launchpad_bug:
--------------------------+-------------------------------------------------
Comment(by zooko):
Oookay, I read the relevant source code, and the miscounting of the number
of connected storage servers was fixed between [3897] (the version that
exhibited the problem) and current HEAD. However, I'm not sure whether that
also means that whatever caused the failures on TestGrid was fixed. Read
on. Here is the code path that produced "Connected to %s of %s known
storage servers" at version [3897]:
[source:src/allmydata/web/welcome.xhtml at 3897#L55]
[source:src/allmydata/web/root.py at 3897#L240]
[source:src/allmydata/introducer/client.py at 3897#L277]
Here is how it is produced today:
[source:src/allmydata/web/welcome.xhtml at 3997#L55]
[source:src/allmydata/web/root.py at 3997#L240]
[source:src/allmydata/storage_client.py at 3997#L41]
The old way could accumulate many tuples of {{{(serverid, servicename,
rref)}}} for the same serverid and servicename, if new connections were
established to the same serverid while the old connection was never
reported as lost, i.e. {{{notifyOnDisconnect()}}} wasn't getting called.
The new way cannot possibly have an inconsistency between the number of
known storage servers and the number of connected storage servers, since
both are computed from the same dict -- the known servers are all the
items of the dict, and the connected storage servers are the ones that
have an rref.
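To make the difference concrete, here is a minimal sketch of the two
counting schemes. The names and data shapes are illustrative, not the
real tahoe-lafs structures:

```python
# Old scheme (sketch): a list of (serverid, servicename, rref) tuples.
# If a reconnection to the same server is added but the stale tuple is
# never removed (notifyOnDisconnect() not firing), that server gets
# counted more than once.
old_connections = [
    ("server-A", "storage", "rref-1"),  # stale entry, never removed
    ("server-A", "storage", "rref-2"),  # reconnection to the same server
    ("server-B", "storage", "rref-3"),
]
connected_old = len(old_connections)                       # 3: overcounts A
known_old = len({sid for (sid, _, _) in old_connections})  # 2

# New scheme (sketch): one dict keyed by serverid; the rref is None when
# the server is known (announced by the introducer) but not connected.
# Both counts come from the same dict, so they cannot disagree about
# which servers exist.
servers = {
    "server-A": "rref-2",
    "server-B": "rref-3",
    "server-C": None,  # known, not currently connected
}
known_new = len(servers)                                   # 3
connected_new = len([r for r in servers.values() if r])    # 2
```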
Brian: what do you think about {{{notifyOnDisconnect()}}} not getting
called even while new connections to the same foolscap peer are being
established? That's the only explanation I can see for the observed
miscounting on the web gateway that was running allmydata-tahoe [3897] and
foolscap 0.4.1.
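For reference, the cleanup pattern under discussion looks roughly like the
following sketch. {{{notifyOnDisconnect()}}} is the real foolscap hook
(it takes a callback plus arguments), but the class and helper here are
stand-ins, not the actual foolscap or tahoe-lafs code:

```python
class FakeRemoteReference:
    """Stand-in for a foolscap RemoteReference (illustrative only)."""
    def __init__(self):
        self._disconnect_watchers = []

    def notifyOnDisconnect(self, cb, *args):
        # Real foolscap also accepts (cb, *args) and fires cb(*args)
        # when the connection is lost.
        self._disconnect_watchers.append((cb, args))

    def simulate_disconnect(self):
        for cb, args in self._disconnect_watchers:
            cb(*args)

connections = []

def add_connection(serverid, servicename, rref):
    entry = (serverid, servicename, rref)
    connections.append(entry)
    # Cleanup hook: remove this entry when the connection drops.
    rref.notifyOnDisconnect(connections.remove, entry)

r1 = FakeRemoteReference()
add_connection("server-A", "storage", r1)
r2 = FakeRemoteReference()
add_connection("server-A", "storage", r2)

# If r1's disconnect notification never fires, both entries persist and
# server-A is counted twice -- the miscounting scenario above.
assert len(connections) == 2
r1.simulate_disconnect()  # when it does fire, the stale entry goes away
assert len(connections) == 1
```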
--
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/653#comment:11>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid