[tahoe-lafs-trac-stream] [Tahoe-LAFS] #2023: regression coincident with iputil fixes, on FreeBSD and Slackware
Tahoe-LAFS
trac at tahoe-lafs.org
Thu Sep 11 00:45:10 UTC 2014
#2023: regression coincident with iputil fixes, on FreeBSD and Slackware
-------------------------+-------------------------------------------------
Reporter: zooko          | Owner: warner
Type: defect             | Status: assigned
Priority: normal         | Milestone: 1.11.0
Component: code-network  | Version: 1.10.0
Resolution:              | Keywords: regression portability iputil blocks-release
Launchpad Bug:           |
-------------------------+-------------------------------------------------
Comment (by warner):
I've got a fix: https://github.com/tahoe-lafs/tahoe-lafs/pull/108
The core issue is that python-2.7.4, 2.7.5, 2.7.6, and 2.7.7 suffer from
python bug 18851 (http://bugs.python.org/issue18851), in which the
stdlib 'subprocess' module closes most or all unrelated file descriptors
when the `exec()` inside `subprocess.call()` fails, such as when the
executable being invoked does not actually exist. There appears to be some
randomness involved: the closing happened at different times during my tests.
This was fixed in python-2.7.8.
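
A rough way to check for the symptom described above is to open a pipe, run a command that cannot be exec'd, and see whether the pipe's descriptors are still open afterwards. This is only a sketch: the helper names and the missing path are made up for illustration, and because the failure is reported to be intermittent, a single clean run does not prove an interpreter is unaffected.

```python
import errno
import os
import subprocess


def fd_is_open(fd):
    """Return True if fd still refers to an open file descriptor."""
    try:
        os.fstat(fd)
        return True
    except OSError as e:
        if e.errno == errno.EBADF:
            return False
        raise


def survives_failed_exec(missing="/nonexistent/ifconfig-like-tool"):
    """Open a pipe, attempt a failing exec, and report whether both
    pipe ends are still open afterwards. The path is hypothetical."""
    r, w = os.pipe()
    try:
        try:
            subprocess.call([missing])
        except OSError:
            pass  # expected: the executable does not exist
        return fd_is_open(r) and fd_is_open(w)
    finally:
        # Clean up whichever descriptors are still open.
        for fd in (r, w):
            if fd_is_open(fd):
                os.close(fd)
```

On an interpreter without the bug, `survives_failed_exec()` should return True every time; repeating the call in a loop gives the intermittent failure more chances to show up.
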
Tahoe's iputil.py uses subprocess.call on many different "ifconfig"-type
executables, most of which don't exist on any given platform (git commit
8e31d66). This results in a lot of random file-descriptor closing, which
(at least during unit tests) tends to clobber important things like Tub
TCP sockets. This seems to be the root cause behind #2121, in which
normal code tries to close already-closed sockets, gets an EBADF ("bad
file descriptor") during node setup, and bails with os.abort(). Since different
platforms have different ifconfigs, some platforms will experience more
failed execs than others, so this bug could easily behave differently on
Linux vs FreeBSD, as well as working normally on python-2.7.8 or 2.7.4.
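
The probing pattern described above can be sketched roughly as follows. The candidate list and the `probe` helper are hypothetical and are not copied from iputil.py; the point is only that every candidate that does not exist on the platform produces one failed exec, so the number of failures varies by platform.

```python
import os
import subprocess

# Hypothetical candidates; the real table in iputil.py may differ.
CANDIDATES = [
    ["/bin/ip", "addr"],
    ["/sbin/ifconfig", "-a"],
    ["/usr/sbin/ifconfig", "-a"],
]


def probe(candidates=CANDIDATES):
    """Try each command in turn.

    Returns (ok, failed): paths that ran and exited 0, and paths whose
    exec failed (for example because the executable is missing).
    Commands that run but exit non-zero appear in neither list.
    """
    ok, failed = [], []
    for argv in candidates:
        try:
            with open(os.devnull, "w") as devnull:
                rc = subprocess.call(argv, stdout=devnull, stderr=devnull)
        except OSError:
            failed.append(argv[0])  # each of these is one failed exec
            continue
        if rc == 0:
            ok.append(argv[0])
    return ok, failed
```

With an affected interpreter, each entry in `failed` is an opportunity for unrelated descriptors to be closed.
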
My proposed fix is to switch to the 'subprocess32' module from PyPI,
which is a backport of the newer 'subprocess' from python3's stdlib. In
python issue 18851, Gregory P. Smith recommends subprocess32 for all
python2 users who would normally use the stdlib version, since it does
not suffer from this bug (and has other bugfixes too).
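
One common way to adopt subprocess32 is a guarded import, sketched below. This is an illustration and not necessarily what the pull request does; in particular, the fallback to the stdlib module is an assumption, and on an affected python2 it would quietly bring the bug back, so a real fix might prefer to fail loudly instead.

```python
import os
import sys

if os.name == "posix" and sys.version_info[0] < 3:
    try:
        # Backport of the python3 subprocess module, installed from PyPI.
        import subprocess32 as subprocess
    except ImportError:
        # Assumption for this sketch: fall back rather than abort.
        import subprocess
else:
    # python3, or a non-POSIX platform: use the stdlib module.
    import subprocess
```

Code that then calls `subprocess.call(...)` does not need to know which implementation was imported.
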
I'm pretty sure this is the fix for #2121. We could also fix it by
requiring python >= 2.7.8, but that would rule out development and
deployment on the current OS X release, 10.9 (which ships with
python-2.7.5), as well as other platforms. Using subprocess32 seems like
the easiest fix.
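
For comparison, the version-based alternative would amount to a check like the one below. The function name is hypothetical, and the affected range is simply the set of releases listed at the top of this comment (2.7.4 through 2.7.7), not something verified independently.

```python
import sys


def stdlib_subprocess_affected(version_info=None):
    """Return True if the interpreter version falls in the range this
    comment lists as affected by python bug 18851."""
    if version_info is None:
        version_info = sys.version_info
    v = tuple(version_info[:3])
    return (2, 7, 4) <= v < (2, 7, 8)
```

For example, `stdlib_subprocess_affected((2, 7, 5))` is True, which is why a hard requirement on 2.7.8 would exclude the stock python on OS X 10.9.
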
I don't know what exactly this ticket (#2023) is about, since we no
longer have the logfiles that prompted it (note to future bug-reporters:
please copy the relevant part of the logs into the ticket, rather than
relying upon long-term access to buildbot logs). But since some of us
apparently believed that #2121 might be a duplicate, and since it's
plausible that this problem could appear on FreeBSD and Slackware from
that era and not on Linux or other python versions, I'm willing to bet
that this change will fix #2023 too.
--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2023#comment:13>
Tahoe-LAFS <https://Tahoe-LAFS.org>
secure decentralized storage