#1159 closed defect (fixed)

stop using .tac files: make it possible to change appname, Python package-directory name, perhaps other names

Reported by: davidsarah Owned by: daira
Priority: major Milestone: 1.10.1
Component: code-nodeadmin Version: 1.8β
Keywords: test-needed backward-compatibility forward-compatibility appname tac needs-spawn packaging setuptools review-needed Cc:
Launchpad Bug:

Description (last modified by daira)

(Reported to tahoe-dev by Nathan Eisenberg.)

In the 1.8beta release, when a user attempts to start a node using a node directory generated by an earlier version (or trunk), it will fail to start and will report a traceback similar to:

~/tahoe-lafs-ticket798-1.7.1-r4653/bin# ./tahoe start
STARTING '/root/.tahoe'
Traceback (most recent call last):
  File "/root/tahoe-lafs-ticket798-1.7.1-r4653/Twisted-10.1.0-py2.5-linux-x86_64.egg/twisted/application/app.py", line 626, in run
    runApp(config)
  File "/root/tahoe-lafs-ticket798-1.7.1-r4653/Twisted-10.1.0-py2.5-linux-x86_64.egg/twisted/scripts/twistd.py", line 23, in runApp
    _SomeApplicationRunner(config).run()
  File "/root/tahoe-lafs-ticket798-1.7.1-r4653/Twisted-10.1.0-py2.5-linux-x86_64.egg/twisted/application/app.py", line 374, in run
    self.application = self.createOrGetApplication()
  File "/root/tahoe-lafs-ticket798-1.7.1-r4653/Twisted-10.1.0-py2.5-linux-x86_64.egg/twisted/application/app.py", line 439, in createOrGetApplication
    application = getApplication(self.config, passphrase)
--- <exception caught here> ---
  File "/root/tahoe-lafs-ticket798-1.7.1-r4653/Twisted-10.1.0-py2.5-linux-x86_64.egg/twisted/application/app.py", line 450, in getApplication
    application = service.loadApplication(filename, style, passphrase)
  File "/root/tahoe-lafs-ticket798-1.7.1-r4653/Twisted-10.1.0-py2.5-linux-x86_64.egg/twisted/application/service.py", line 390, in loadApplication
    application = sob.loadValueFromFile(filename, 'application', passphrase)
  File "/root/tahoe-lafs-ticket798-1.7.1-r4653/Twisted-10.1.0-py2.5-linux-x86_64.egg/twisted/persisted/sob.py", line 210, in loadValueFromFile
    exec fileObj in d, d
  File "tahoe-client.tac", line 5, in <module>
    pkg_resources.require('allmydata-tahoe')
  File "/root/tahoe-lafs-ticket798-1.7.1-r4653/setuptools-0.6c16dev.egg/pkg_resources.py", line 666, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/root/tahoe-lafs-ticket798-1.7.1-r4653/setuptools-0.6c16dev.egg/pkg_resources.py", line 565, in resolve
    raise DistributionNotFound(req)  # XXX put more info here
pkg_resources.DistributionNotFound: allmydata-tahoe

This happens because the appname was changed to "tahoe-lafs-ticket798" for the beta [4595/ticket798], but existing .tac files contain references to the previous "allmydata-tahoe" appname. For example, tahoe-client.tac typically contains:

# -*- python -*-

import pkg_resources
pkg_resources.require('allmydata-tahoe')
pkg_resources.require('twisted')
from allmydata import client
from twisted.application import service

c = client.Client()

application = service.Application("allmydata_client")
c.setServiceParent(application)

Note that the appname change has not been made on trunk, only on the ticket798 branch from which the beta was built.

We should therefore withdraw the beta, because otherwise users may generate new node directories that will not work in future versions (due to the opposite problem, i.e. DistributionNotFound: tahoe-lafs-ticket798).

This failed to be detected because there are no tests that check compatibility with a node directory generated by a previous version. To prevent this happening again, we need the test suite to run at least a smoke test with a node directory that looks like one generated by an earlier version (perhaps check one in as a resource, with a big-fat-warning not to change it).

Attachments (1)

notac.diff (10.6 KB) - added by warner at 2012-05-23T07:20:11Z.
stop using contents of .tac files

Download all attachments as: .zip

Change History (54)

comment:1 follow-up: Changed at 2010-08-06T07:26:34Z by zooko

Hm, there are four overlapping namespaces here.

One is the "appname" which is defined by us (the Tahoe-LAFS project) to distinguish different variants or forks of Tahoe-LAFS or independent implementations of the LAFS protocol. We agreed on how to do this versioning in this message to tahoe-dev. The appname is used in in allmydata/__init__.py to construct a variable named __full_version__, which is used in a few places such as storage/server.py. The appname for Tahoe-LAFS has been "allmydata-tahoe" for the last couple of years, and we would like to change the appname to "tahoe-lafs" someday.

Another namespace is the Python "distribution" namespace. (A "distribution" is what the Python world calls a package.) This is a global namespace where everyone on the planet is supposed to avoid choosing colliding names. An important central locus of this namespace is the Python Package Index (which I suppose ought to be called the "Python Distribution Index")—installers such as easy_install and pip, when asked to install a distribution named $DIST, will by default look for it at http://pypi.python.org/pypi/$DIST. The distribution name for Tahoe-LAFS has been "allmydata-tahoe" and we would like to change it to "tahoe-lafs".

The third namespace is the Python "package" namespace. (A "package" is what the Python world calls a directory containing an __init__.py file.) This namespace is global and is populated by all of the Python distributions that you install on your system. Collisions are silently resolved by a complicated algorithm involving your PYTHONPATH and a certain widely disliked hack due originally to setuptools (setuptool #53). The tahoe-client.tac file loads a package named allmydata. We would probably like to change that in the future to tahoe.

Now currently our build scripts ensure that the Python distribution name is always equal to the appname, and the create-node command adds pkg_resources.require(APPNAME) to the tahoe-client.tac file.

The fourth namespace is the scripts and executables in your system, where we currently create a file named tahoe.

One thing that we can do right away to ease this is change create-node to stop putting the call to pkg_resources.require() in the tahoe-client.tac.

comment:2 in reply to: ↑ 1 ; follow-up: Changed at 2010-08-06T20:30:46Z by davidsarah

Replying to zooko:

Hm, there are four overlapping namespaces here.

One is the "appname" which is defined by us (the Tahoe-LAFS project) to distinguish different variants or forks of Tahoe-LAFS or independent implementations of the LAFS protocol. We agreed on how to do this versioning in this message to tahoe-dev. The appname is used in in allmydata/__init__.py to construct a variable named __full_version__, which is used in a few places such as storage/server.py. The appname for Tahoe-LAFS has been "allmydata-tahoe" for the last couple of years, and we would like to change the appname to "tahoe-lafs" someday.

Another namespace is the Python "distribution" namespace. (A "distribution" is what the Python world calls a package.) This is a global namespace where everyone on the planet is supposed to avoid choosing colliding names. An important central locus of this namespace is the Python Package Index (which I suppose ought to be called the "Python Distribution Index")—installers such as easy_install and pip, when asked to install a distribution named $DIST, will by default look for it at http://pypi.python.org/pypi/$DIST. The distribution name for Tahoe-LAFS has been "allmydata-tahoe" and we would like to change it to "tahoe-lafs".

As far as I understood, these were intended to be the same by definition, at least for Python implementations of the protocol. I.e. the "application-version" field in introduction messages is specifying the implementation version (not a protocol version), and the distribution string is also specifying the implementation version, and so we put the latter string into application-version. (Clearly there can be non-Python implementations of the protocol, and they would use whatever is the nearest equivalent of the distribution string.)

Is there any reason for these not to be the same by definition, for Python implementations?

The third namespace is the Python "package" namespace. (A "package" is what the Python world calls a directory containing an __init__.py file.)

No, absolutely not! That's the module namespace. A package is something else entirely; see below.

(Okay, much of the Python world may be inconsistent in the terminology they use, but the official docs are consistent on this point.)

Now currently our build scripts ensure that the Python distribution name is always equal to the appname,

More precisely (and it's important to be precise here), a distribution is named by appname+'-'+version. The things named by an appname alone are called "packages" (hence "Python Package Index"). A distribution contains a particular version of a package.

(Packages are also called "projects" in some of the setuptools documentation, but no-one uses that, not even the rest of the setuptools docs and code.)

and the create-node command adds pkg_resources.require(APPNAME) to the tahoe-client.tac file.

The fourth namespace is the scripts and executables in your system, where we currently create a file named tahoe.

One thing that we can do right away to ease this is change create-node to stop putting the call to pkg_resources.require() in the tahoe-client.tac.

As far as I understand, the ostensible purpose of that call is to ensure that when we import modules named allmydata.*, we're getting them from some version of the allmydata-tahoe package (this doesn't help to ensure that we're getting the right version). On the other hand, I don't think it actually succeeds in doing even that, if another package defines an allmydata module and happens to be earlier on sys.path, which is currently possible. However I think it is also necessary in order for the call to load_entry_point in the CLI wrapper script to work correctly.

(The implementation of pkg_resources.require in zetuptoolz is here. Good luck tracing that :-)

comment:3 follow-up: Changed at 2010-08-06T21:41:30Z by warner

I seem to recall that the pkg_resources.require API is intended to let programs express a dependency on specific versions (or version ranges) of other package/modules, in the actual code that does the import (as opposed to off in some remote setup.py world). Using it without a version number specification seems to be not particularly useful, although I think it probably does serve to tell setuptools to fuss around with PYTHONPATH enough to allow the import to maybe work.

I think I'd like to see a list of all the things in our tree that depend upon PYTHONPATH being set in some particular way (probably bin/tahoe and test_runner), and then see if there's a way to accomplish this by either removing those places or purely by manipulating PYTHONPATH, depending less upon setuptools mechanisms. In particular, I think that twistd has been loadable as a library for several years now, so I think that the subprocess-spawning done by tahoe start/tahoe run can go away, which might remove one of the needs to change PYTHONPATH. (the test_runner code that really does want to spawn a subprocess, because it wants to kill the subprocess, may be replaceable with an os.fork or something, so the child can simply inherit the environment of the parent).

comment:4 Changed at 2010-08-07T04:39:40Z by davidsarah

  • Keywords regression removed
  • Priority changed from critical to major

This bug is not present in the 1.8.0c1 release candidate, but leaving open because we need a test.

comment:5 in reply to: ↑ 2 ; follow-up: Changed at 2010-08-09T00:20:02Z by zooko

Replying to davidsarah:

Replying to zooko:

One is the "appname" which is defined by us (the Tahoe-LAFS project) to distinguish different variants or forks of Tahoe-LAFS or independent implementations of the LAFS protocol. Another namespace is the Python "distribution" namespace. (A "distribution" is what the Python world calls a package.)

As far as I understood, these were intended to be the same by definition, at least for Python implementations of the protocol. I.e. the "application-version" field in introduction messages is specifying the implementation version (not a protocol version), and the distribution string is also specifying the implementation version, and so we put the latter string into application-version. (Clearly there can be non-Python implementations of the protocol, and they would use whatever is the nearest equivalent of the distribution string.)

Agreed.

Is there any reason for these not to be the same by definition, for Python implementations?

I can't think of any at the moment.

The third namespace is the Python "package" namespace. (A "package" is what the Python world calls a directory containing an __init__.py file.)

No, absolutely not! That's the module namespace. A package is something else entirely; see below.

(Okay, much of the Python world may be inconsistent in the terminology they use, but the official docs are consistent on this point.)

What docs are you referring to? I really wish that there was some official justification to use "package" to mean a package, but sadly I was unable to persuade the distutils-sig mailing list to get behind a full-scale renaming. Therefore we are currently stuck with this:

http://guide.python-distribute.org/glossary.html

Distribution

A Python distribution is a versioned compressed archive file that contains Python packages, modules, and other resource files. The distribution file is what an end-user will download from the internet and install.

A distribution is often also called a package. This is the term commonly used in other fields of computing. For example, Mac OS X and Debian call these files package files. However, in Python, the term package refers to an importable directory. In order to distinguish between these two concepts, the compressed archive file containing code is called a distribution.

However, it is not uncommon in Python to refer to a distribution using the term package. While the two meanings of the term package is not always 100% unambigous, the context of the term package is usually sufficient to distinguish the meaning of the word. For example, the python installation tool pip is an acronym for “pip installs packages”, while technically the tool installs distributions, the name package is used as it’s meaning is more widely understood. Even the site where distributions are distributed at is called the Python Package Index (and not the Python Distribution Index).

(Packages are also called "projects" in some of the setuptools documentation, but no-one uses that, not even the rest of the setuptools docs and code.)

That might be our only hope -- the word "project" isn't already used to mean something else within Python, so maybe we could start consistently using it to mean all-versions-of the source that is used to produce distributions (each of which has a version number).

By the way I think that Brian in comment:3 is showing that he is confused about the difference between distributions (sometimes called packages, especially by the rest of the world outside of Python), packages (directories that have __init__.py files in them), and modules (importable bundles of Python code). The pkg_resources API is intended to express a dependency on a project, i.e. a range (possibly "any") of versions of distributions of that project. (Note the name begins with "pkg", which stands for "distribution". Ha ha.)

David-Sarah wrote in comment:2:

As far as I understand, the ostensible purpose of that call is to ensure that when we import modules named allmydata.*, we're getting them from some version of the allmydata-tahoe package

My purpose in adding that call to pkg_resources was to ensure that if the distribution ("package"? "project"?) was not installed that we would get an explicit error at that point instead of proceding. Actually, it was more to ensure that the transitive closure of the install_requires of the allmydata-tahoe distribution were installed. It has been a long time since I've felt like I needed that double-check and I don't object to removing it.

Last edited at 2010-08-09T00:20:35Z by zooko (previous) (diff)

comment:6 in reply to: ↑ 5 Changed at 2010-08-09T05:49:10Z by davidsarah

Replying to zooko:

Replying to davidsarah:

Replying to zooko:

[...]

The third namespace is the Python "package" namespace. (A "package" is what the Python world calls a directory containing an __init__.py file.)

No, absolutely not! That's the module namespace. A package is something else entirely; see below.

(Okay, much of the Python world may be inconsistent in the terminology they use, but the official docs are consistent on this point.)

What docs are you referring to?

I meant that the python.org docs are consistent in not using "package" when they mean "module" (at least I haven't seen any counterexamples on python.org).

I really wish that there was some official justification to use "package" to mean a package, but sadly I was unable to persuade the distutils-sig mailing list to get behind a full-scale renaming. Therefore we are currently stuck with this:

http://guide.python-distribute.org/glossary.html

Distribution

A Python distribution is a versioned compressed archive file that contains Python packages, modules, and other resource files. The distribution file is what an end-user will download from the internet and install.

OK, that's not too bad.

A distribution is often also called a package. This is the term commonly used in other fields of computing. For example, Mac OS X and Debian call these files package files. However, in Python, the term package refers to an importable directory.

Gahh. This is completely wrong and unnecessarily confusing. Pretend they didn't say that.

The right term for the latter concept is "module directory".

(Module code can also be stored in zipfiles, so this isn't necessarily a directory in a filesystem.)

[...]

(Packages are also called "projects" in some of the setuptools documentation, but no-one uses that, not even the rest of the setuptools docs and code.)

That might be our only hope -- the word "project" isn't already used to mean something else within Python, so maybe we could start consistently using it to mean all-versions-of the source that is used to produce distributions (each of which has a version number).

I've no objection to using "project", if you think "package" is ambiguous.

[...] The pkg_resources API is intended to express a dependency on a project, i.e. a range (possibly "any") of versions of distributions of that project.

The strings specifying a package/project possibly with a version constraint, e.g. "allmydata-tahoe>=1.7.0", are called "requirements".

David-Sarah wrote in comment:2:

As far as I understand, the ostensible purpose of that call is to ensure that when we import modules named allmydata.*, we're getting them from some version of the allmydata-tahoe package

My purpose in adding that call to pkg_resources was to ensure that if the distribution ("package"? "project"?)

The argument to pkg_resources.require is a requirement. In this case it is just a requirement for an unspecified version of the allmydata-tahoe project, which is not all that useful, I think.

was not installed that we would get an explicit error at that point instead of proceding. Actually, it was more to ensure that the transitive closure of the install_requires of the allmydata-tahoe distribution were installed.

I don't think it actually checks that.

It has been a long time since I've felt like I needed that double-check and I don't object to removing it.

Let's try that (for after 1.8.0) and see if it breaks anything.

comment:7 follow-up: Changed at 2010-08-09T20:09:19Z by warner

The argument to pkg_resources.require is a requirement. In this case it is just a requirement for an unspecified version of the allmydata-tahoe project, which is not all that useful, I think.

was not installed that we would get an explicit error at that point instead of proceding. Actually, it was more to ensure that the transitive closure of the install_requires of the allmydata-tahoe distribution were installed.

I don't think it actually checks that.

My vague recollection is that it does, because I think I remember having a Foolscap installed in such a way that 'import foolscap' worked fine but setuptools/pkg_resources didn't believe it was present (or wasn't of the required version), and that commenting out the pkg_resources.require call was necessary to make it stop throwing NoSuchDistribution exceptions or the like.

I wasn't too thorough about checking that, so I could be wrong.

comment:8 in reply to: ↑ 7 Changed at 2010-08-14T06:39:35Z by zooko

Replying to warner:

Actually, it was more to ensure that the transitive closure of the install_requires of the allmydata-tahoe distribution were installed.

I don't think it actually checks that.

My vague recollection is that it does,

Yeah, it does:

MAIN Zooko-Ofsimplegeos-MacBook-Pro:~$ tahoe --version
allmydata-tahoe: 1.8.0c2-r4698, foolscap: 0.5.1, pycryptopp: 0.5.17, zfec: 1.4.5, Twisted: 10.1.0-r29874, Nevow: 0.10.0, zope.interface: 3.5.1, python: 2.6.1, platform: Darwin-10.4.0-i386-64bit, sqlite: 3.6.12, simplejson: 2.1.1, distribute: 0.6.13, argparse: 1.1, pycrypto: 2.0.1, pyOpenSSL: 0.7, pyutil: 1.7.10, zbase32: 1.1.2, setuptools: 0.6, pyasn1: 0.0.11a, pysqlite: 2.4.1
MAIN Zooko-Ofsimplegeos-MacBook-Pro:~$ verinfo pyasn1
pkg_resources.require('pyasn1') =>  [pyasn1 0.0.11a (/Library/Python/2.6/site-packages/pyasn1-0.0.11a-py2.6.egg)]
import pyasn1;print pyasn1 =>  <module 'pyasn1' from '/Library/Python/2.6/site-packages/pyasn1-0.0.11a-py2.6.egg/pyasn1/__init__.pyc'>
import pyasn1;print pyasn1.__version__ =>  False
MAIN Zooko-Ofsimplegeos-MacBook-Pro:~$ sudo /bin/rm -rf /Library/Python/2.6/site-packages/pyasn1-0.0.11a-py2.6.egg
Password:
MAIN Zooko-Ofsimplegeos-MacBook-Pro:~$ tahoe --version
Traceback (most recent call last):
  File "/usr/local/bin/tahoe", line 6, in <module>
    from pkg_resources import load_entry_point
  File "/Library/Python/2.6/site-packages/distribute-0.6.13-py2.6.egg/pkg_resources.py", line 2671, in <module>
    working_set.require(__requires__)
  File "/Library/Python/2.6/site-packages/distribute-0.6.13-py2.6.egg/pkg_resources.py", line 654, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/Library/Python/2.6/site-packages/distribute-0.6.13-py2.6.egg/pkg_resources.py", line 552, in resolve
    raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: pyasn1>=0.0.8a

[edit by davidsarah: this test is misleading. it's not the call to pkg_resources.require('allmydata-tahoe') in the tac file that performs this check; the check is done as a side-effect of "import pkg_resources" after setting __requires__ in the setuptools-generated support script. All explicit calls to pkg_resources.require for package names that were already in __requires__ are redundant.]

Last edited at 2011-08-22T14:45:05Z by davidsarah (previous) (diff)

comment:9 follow-up: Changed at 2010-08-14T06:43:22Z by zooko

  • Milestone changed from 1.8.0 to eventually
  • Summary changed from 1.8beta is incompatible with existing node directories due to change of appname to make it possible to change appname, Python package-directory name, perhaps other names

What are the next steps on this ticket? I think it is to remove the pkg_resources.require() in in tahoe-client.tac, and possibly to make a facility to overwrite tahoe-client.tac with a new version on install so that when people install Tahoe-LAFS v1.9.0 they get this new version.

comment:10 in reply to: ↑ 9 Changed at 2010-09-12T23:04:06Z by davidsarah

Replying to zooko:

What are the next steps on this ticket? I think it is to remove the pkg_resources.require() in in tahoe-client.tac, and possibly to make a facility to overwrite tahoe-client.tac with a new version on install so that when people install Tahoe-LAFS v1.9.0 they get this new version.

How would we know where all of the .tac files are on install? I think that to be sufficiently compatible, we'd have to work correctly even when the .tac file used when starting a node is one that we've never seen before.

Someone please explain what the benefit is of using .tac files at all? Is that just "the way it's done" in Twisted?

comment:11 Changed at 2010-10-27T22:35:53Z by warner

Re .tac files:

It's probably time for them to die. When I wrote the very first get-a-client-running framework (30-Nov-2006, wow), I copied the scheme I'd used in buildbot, since it was an easy way to get a daemon process running.

The original idea of .tac files (and .tap and .tax files, their black-sheep siblings) was to decouple the implementation of a service from the deployment. From a developer's point of view, it's pretty easy to create a service in the Twisted world: you make a subclass of twisted.application.service.Service, override startService/stopService to control startup/shutdown, and connect other services in a tree shape with setServiceParent.

To actually start this, the cheap "__main__" way is to use service.startService(); reactor.run(), and the cheap way to configure it is to pass arguments into your service's constructor. But ops folks (or, you know, *users*) who want to run your service don't want to write python code. And they might want to configure the service without writing python code.

The "official" way to do this, at least as of 2006, was a Twisted tool named 'mktap'. This uses plugins (which you write just after you write your Service subclass) to transform command-line arguments (a twisted.python.usage.Usage instance) into a fully-configured Service instance, and then serializes (with pickle) the Service out into a .tap file. Then, later, you use 'twistd foo.tap' to thaw out the Service and start it running (properly daemonized and setuided and all). twistd would also serialize the Service object when you quit the process, so you could achieve persistence by running 'twistd foo-shutdown.tap' on subsequent runs.

Eventually twistd learned how to use the mktap plugins directly, so I think you can run twistd web --path . to do the same thing as mktap web --path . ; twistd web.tap but without the leftover .tap file.

But, pickle files are a drag, and the auto-persistence -shutdown.tap feature turned out to be a bad idea. Once you stop using pickle, then having a non-human-readable configuration/deployment file is a drag. So .tac (which is a python program that provides a distinguished 'application' symbol, and is evaluated by twistd) becomes reasonable.

Anyways, the benefit of writing out a .tac file is that you could use the generic twistd to launch the service (assuming that everything is on your PYTHONPATH), which gives you a bunch of maybe-useful options for free: --syslog, --profile, --rundir, --chroot, --uid, --reactor. There are also some tools which make it easy to launch twistd-style jobs from /etc/init.d/ initscripts, or from mac LaunchAgent plists.

But, now that twistd is invokable as a library, it's easier to get those free arguments without needing to duplicate them all in the 'tahoe start' option parser. If we commit to having bin/tahoe be the only way to launch tahoe, then it can set up sys.path before importing twisted.scripts.twistd.

So maybe what's left is to define a data file (perhaps named client.tac, for backwards compatibility), from which tahoe start can determine that it ought to instantiate an allmydata.client.Client instead of an allmydata.introducer.IntroducerNode. And then change tahoe start to do whatever pkg_resources.require might really really be necessary before scanning the data file and doing the conditional import.

Or maybe tahoe start could look for "tahoe.cfg" to decide if the target directory is really a tahoe nodedir, look either for a key in it or the presence of client.tac/introducer.tac to decide what top-level Service to instantiate, but then otherwise ignore the .tac file. And new versions of tahoe create-node could write the nodetype= key into tahoe.cfg and *not* write the .tac file.

comment:12 follow-up: Changed at 2010-10-27T23:33:43Z by warner

Oh, one caveat: since we delegate daemonization/etc to twistd, we're limited in the set of entry points we can use. The easiest one is twisted.scripts.twistd.run, for which the arguments are effectively sys.argv, which means it really wants to find a .tac file. What we want is a way to build the twisted.application.service.Application or .MultiService instance and hand it fully-formed into twistd. So we'd need to find a stable/reasonable point-of-entry into the twistd code that accepts a Service but is still before the daemonization/logging/etc setup.

comment:13 Changed at 2010-10-27T23:41:33Z by warner

one hack might be to set up the plugins table, then set argv to something with a "subcommand" which vectors into a special tahoe-Service-making "plugin" (which really doesn't exist anywhere else on the filesystem, only in RAM). Another would be to somehow override twisted.application.app.ApplicationRunner.createOrGetApplication, possibly by monkeypatching twisted.scripts.twistd._SomeApplicationRunner, which sounds like a bad idea. (ApplicationRunner is the parent class of both WindowsApplicationRunner and UnixApplicationRunner, and twisted/scripts/twistd.py is responsible for loading the right one).

It might also be possible to grab control by wrapping the runApp in twistd.py before it is passed to app.run. I've not yet investigated whether that would help, though, it might be too late.

comment:14 Changed at 2010-10-28T05:54:50Z by zooko

I'm not completely following the details, but the general idea sounds great to me.

comment:15 Changed at 2010-11-27T09:49:47Z by warner

  • Component changed from code to code-nodeadmin
  • Summary changed from make it possible to change appname, Python package-directory name, perhaps other names to stop using .tac files: make it possible to change appname, Python package-directory name, perhaps other names

I just started to write up a duplicate ticket.. fixing up the tags on this one so I can find it next time.

comment:16 Changed at 2010-11-27T20:31:37Z by davidsarah

  • Keywords appname tac added

comment:17 in reply to: ↑ 12 Changed at 2011-01-22T23:35:56Z by davidsarah

Replying to warner:

Oh, one caveat: since we delegate daemonization/etc to twistd, we're limited in the set of entry points we can use. The easiest one is twisted.scripts.twistd.run, for which the arguments are effectively sys.argv, which means it really wants to find a .tac file. What we want is a way to build the twisted.application.service.Application or .MultiService instance and hand it fully-formed into twistd. So we'd need to find a stable/reasonable point-of-entry into the twistd code that accepts a Service but is still before the daemonization/logging/etc setup.

Or we can put a .tac file in the source tree, and direct twistd to that one (using something like os.path.join(os.path.dirname(allmydata.__file__), 'tahoe_node.tac')), rather than to the one in the node directory. Even though it doesn't "stop using .tac files", that would solve the problems in this ticket because the in-source .tac can change without any compatibility issues.

This depends on the Tahoe source not being zipped, but we already say in setup.py that it is not zip-safe.

comment:18 Changed at 2011-08-22T16:34:48Z by davidsarah

Elrond on #tahoe-lafs said that they had a problem starting nodes with a (slightly modified) version of the Tahoe-LAFS 1.8.2-3 Debian package, due to the pkg_resources.require('twisted') in the .tac file. Apparently twisted had been manually installed in a way that wasn't recognized by pkg_resources (possibly due to http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=631163). Commenting out the pkg_resources.require worked.

The output of tahoe --version-and-path was

allmydata-tahoe: 1.8.2 (/usr/lib/pymodules/python2.6),
foolscap: 0.6.1 (/usr/lib/python2.6/dist-packages),
pycryptopp: 0.5.29 (/usr/lib/python2.6/dist-packages),
zfec: 1.4.5 (/usr/lib/python2.6/dist-packages),
Twisted: 10.1.0 (/usr/lib/python2.6/dist-packages),
Nevow: 0.10.0 (/usr/lib/python2.6/dist-packages),
zope.interface: unknown (/usr/lib/python2.6/dist-packages/zope),
python: 2.6.6 (/usr/bin/python),
platform: Linux-debian_6.0.2-i686-32bit_ELF (None),
pyOpenSSL: 0.10 (/usr/lib/pymodules/python2.6),
simplejson: 2.1.1 (/usr/lib/pymodules/python2.6),
pycrypto: 2.1.0 (/usr/lib/python2.6/dist-packages),
pyasn1: unknown (/usr/lib/pymodules/python2.6/pyasn1),
mock: 0.6.0 (/usr/lib/pymodules),
sqlite3: 2.4.1 [sqlite 3.7.6.2] (/usr/lib/python2.6),
setuptools: 0.6 [distribute] (/usr/lib/python2.6/dist-packages)

Warning: dependency twisted (version 10.1.0 imported from '/usr/lib/python2.6/dist-packages') was not found by pkg_resources.

For debugging purposes, the PYTHONPATH was
  None
install_requires was
  ['setuptools >= 0.6c6', 'zfec >= 1.1.0', 'simplejson >= 1.4', 'zope.interface', 'foolscap[secure_connections] >= 0.6.1', 'Nevow >= 0.6.0', 'pycrypto == 2.0.1, == 2.1.0, >= 2.3', 'pyasn1 >= 0.0.8a', 'mock', 'pycryptopp >= 0.5.14']
sys.path after importing pkg_resources was
  /usr/bin:
  /usr/lib/python2.6:
  /usr/lib/python2.6/plat-linux2:
  /usr/lib/python2.6/lib-tk:
  /usr/lib/python2.6/lib-old:
  /usr/lib/python2.6/lib-dynload:
  /usr/local/lib/python2.6/dist-packages:
  /usr/lib/python2.6/dist-packages:
  /usr/lib/python2.6/dist-packages/PIL:
  /usr/lib/pymodules/python2.6:
  /usr/lib/pymodules/python2.6/gtk-2.0:
  /usr/lib/python2.6/dist-packages/wx-2.8-gtk2-unicode
Last edited at 2011-08-22T18:08:20Z by davidsarah (previous) (diff)

comment:19 Changed at 2011-08-22T16:39:02Z by davidsarah

When investigating workarounds, note that we can now depend on Twisted >= 10.1.

comment:20 Changed at 2011-09-22T20:02:40Z by zooko

  • Keywords needs-spawn added

This is one of those tickets that needs to spawn off smaller and more closeable subtickets.

comment:21 Changed at 2011-09-22T20:06:19Z by zooko

  • Keywords packaging setuptools added

I just created #1539 which is a small step we could do (I think) right away and that might ease things in the future.

comment:22 in reply to: ↑ 3 Changed at 2011-09-22T23:56:33Z by zooko

Replying to warner:

In particular, I think that twistd has been loadable as a library for several years now, so I think that the subprocess-spawning done by tahoe start/tahoe run can go away, which might remove one of the needs to change PYTHONPATH. (the test_runner code that really does want to spawn a subprocess, because it wants to kill the subprocess, may be replaceable with an os.fork or something, so the child can simply inherit the environment of the parent).

Brian made this change in ac3b26ecf29c08cb.

comment:23 follow-up: Changed at 2011-09-23T00:04:26Z by zooko

Okay, having re-read this ticket, it sounds like the way forward is for someone to figure out how to use twistd as an importable library instead of as a command that gets executed and then loads our code. See comment:11, comment:12, comment:13, comment:17. Is that the next step?

comment:24 in reply to: ↑ 23 Changed at 2011-09-23T00:16:05Z by davidsarah

Replying to zooko:

Okay, having re-read this ticket, it sounds like the way forward is for someone to figure out how to use twistd as an importable library instead of as a command that gets executed and then loads our code. See comment:11, comment:12, comment:13, comment:17. Is that the next step?

Yes. comment:17 is a way of doing that that will work, but is ugly. I don't know whether there's a more elegant way. (Duplicating some of twistd's daemonizing code would also work but would be even uglier.)

comment:25 Changed at 2012-05-23T05:02:23Z by warner

  • Owner changed from somebody to warner
  • Status changed from new to assigned

I was able to use the first technique from comment:13 in my "toolbed" app (i.e. creating a fake in-RAM "plugin", then a sys.argv that references it). I'll build a patch that changes 'tahoe start' to use that technique so we can give it a spin.

Changed at 2012-05-23T07:20:11Z by warner

stop using contents of .tac files

comment:26 Changed at 2012-05-23T07:21:17Z by warner

  • Keywords review-needed added
  • Owner changed from warner to davidsarah
  • Status changed from assigned to new

davidsarah: How's that look?

comment:27 Changed at 2012-05-23T15:32:30Z by davidsarah

  • Status changed from new to assigned

Will review.

comment:28 Changed at 2012-05-23T15:44:26Z by davidsarah

  • Milestone changed from eventually to 1.10.0

comment:29 Changed at 2013-01-03T16:23:56Z by davidsarah

Starting review.

comment:30 Changed at 2013-01-03T16:35:51Z by davidsarah

The patch applies cleanly to trunk and passes existing tests.

Re: this comment

# this can't handle e.g. 'tahoe start --nodaemon', since then 
# --nodaemon looks like a basedir. So you can either use 'tahoe 
# start' or 'tahoe start BASEDIR --TWISTD-OPTIONS'.

That's OK but the error message that you're likely to get is "--basedir '--nodaemon' doesn't exist", which is not as helpful as it could be.

The patch adjusts some tests but doesn't appear to fully test the change in functionality. Will think about what tests there should be.

Continuing review...

comment:31 Changed at 2013-01-03T16:38:18Z by davidsarah

tahoe start --help says "Usage: tahoe start [options] [NODEDIR], which is wrong if the options should be after the nodedir. (Similarly for restart etc.)

comment:32 Changed at 2013-01-03T16:41:08Z by davidsarah

The twistd options are available but not documented in tahoe start --help. See tahoe debug trial --help for an example of how it is possible to get those options listed.

comment:33 follow-up: Changed at 2013-01-03T16:43:20Z by davidsarah

Does the fact that "# also ("--logfile", "tahoesvc.log")" is commented out mean that there is some regression in logging functionality?

comment:34 in reply to: ↑ 33 ; follow-up: Changed at 2013-01-03T16:56:36Z by davidsarah

Replying to davidsarah:

Does the fact that "# also ("--logfile", "tahoesvc.log")" is commented out mean that there is some regression in logging functionality?

Oh, that applies only to startstop_node.run, and it has the effect of not redirecting stdout/stderr for tahoe run and so fixing #1489. That's a good change, but the comment should be removed or clarified, and/or we should document the change.

Last edited at 2013-01-03T18:47:09Z by davidsarah (previous) (diff)

comment:35 in reply to: ↑ 34 Changed at 2013-01-03T17:00:04Z by davidsarah

Replying to davidsarah:

Oh, that applies only to runner.run, and it has the effect of not redirecting stdout/stderr and so fixing #1489. That's a good change, but the comment should be removed or clarified, and/or we should document the change.

in git/docs/logging.rst for example.

comment:36 Changed at 2013-01-03T17:05:54Z by davidsarah

"XYZ" could probably be something less mysterious like "StartTahoeNode", while still having negligible chance of colliding with any existing subcommand.

comment:37 Changed at 2013-01-03T17:39:42Z by davidsarah

twistd_args.extend(config.twistd_args)
twistd_args.append("XYZ") # point at our NodeStartingPlugin

Could an argument to tahoe that does not start with "--" be confused by twistd for the command name?

If I try tahoe start ~/.tahoe foo (which should be an error), I get a twistd options listing followed by a TypeError traceback, which appears to be due to a bug at startstop_node.py line 114. This change fixes the bug, and snips some implementation details about twistd from the end of the options listing:

-        print >>err, "tahoe start: %s" % (config.subCommand, ue)
-        print >>err, twistd_config
---
+        print >>err, str(twistd_config).partition("\n\n")[0]
+        print >>err
+        print >>err, "tahoe start: %s\n" % (ue,)
Version 0, edited at 2013-01-03T17:39:42Z by davidsarah (next)

comment:38 Changed at 2013-01-03T19:00:11Z by davidsarah

I pushed patches addressing some of the above comments to https://github.com/davidsarah/tahoe-lafs/commits/1159-no-tac.

https://github.com/davidsarah/tahoe-lafs/commit/a7e4cab9c3367048d8ea326f559f65a6accdb8c1 is warner's original patch and subsequent ones are mine. They do not yet address comment:30.

The -d / --node-directory= / -C / --basedir= option isn't working correctly when twistd options are given, because there's nothing to separate the tahoe options from the twistd options. Not sure what to do about that, but it interacts with #166.

Last edited at 2013-01-03T19:02:23Z by davidsarah (previous) (diff)

comment:39 Changed at 2013-03-21T18:44:03Z by warner

  • Milestone changed from 1.10.0 to 1.11.0

I want this, but won't block 1.10 for it. Kicking it and its cousin #1539 into 1.11 . Should be easy to get done early in the next cycle.

comment:40 Changed at 2013-03-23T02:48:04Z by davidsarah

+1 for kicking this out.

comment:41 Changed at 2013-04-23T05:19:32Z by daira

This is a blocker for #1950.

comment:42 Changed at 2013-05-02T02:02:31Z by warner

I've updated my branch with daira's recommended patches. https://github.com/warner/tahoe-lafs/tree/1159-notac

Last edited at 2013-05-03T06:50:24Z by warner (previous) (diff)

comment:43 Changed at 2013-08-28T15:59:17Z by daira

  • Description modified (diff)
  • Milestone changed from soon to 1.11.0

comment:44 Changed at 2014-03-21T18:53:27Z by remyroy

Is the code review completed on this one? Does it need to be updated or done once more?

comment:45 Changed at 2014-07-23T14:46:43Z by daira

zs260: hi, i'm trying to create systemd for tahoe introducer. i got the client working fine, but for introducer, i need to run it in foreground mode, tahoe run, and it complains " looks like it contains a non-client node (tahoe-introducer.tac). Use tahoe start instead of tahoe run"

zs260: introducer works fine with tahoe start, but when i use with systemd, type="forking", it does not work. systemd keeps waiting

daira: oh, zs260 left. I was just about to answer their question

daira: the fact that 'tahoe run' doesn't work for introducers (or key-generators or stats-gatherers) is a ticketed bug

daira: #1159 (also #937 which was closed as a duplicate of #1159)

daira: #1159 is likely to be fixed for 1.11

comment:46 Changed at 2014-09-02T16:33:10Z by daira

  • Owner changed from davidsarah to daira
  • Status changed from assigned to new

comment:47 Changed at 2014-09-02T16:33:44Z by daira

  • Status changed from new to assigned

comment:48 Changed at 2014-09-02T18:14:28Z by warner

daira: if you need me to rebase my branch, please let me know. I'll run tests locally and make sure it hasn't bitrotted.

comment:49 Changed at 2014-09-03T11:30:38Z by daira

  • Owner changed from daira to warner
  • Status changed from assigned to new

Yes please! Reassign me when you're done.

comment:50 follow-up: Changed at 2014-09-05T06:31:04Z by warner

I've rebased the branch (and fixed one test which was failing.. not sure what exactly was happening there, the fix was to revert a test change that I made to allow the original patch to pass). The new branch is here:

https://github.com/warner/tahoe-lafs/tree/1159-notac-2

I think we should land this, and then close #1539 because this branch supercedes that one.

Note that this patch doesn't completely close #1159: it stops using the *contents* of the .tac files, but still relies upon their presence, in particular their filename: if "client" is a substring of the filename of the first .tac file found in the node's basedir, then we instantiate a Client, if "introducer" is in the string, we instantiate an IntroducerNode, etc.

We'll want another patch, after 1.11, to move this indicator into tahoe.cfg (and look for a .tac filename as a backwards-compatibility tool).

comment:51 Changed at 2014-09-05T06:31:19Z by warner

  • Owner changed from warner to daira

comment:53 in reply to: ↑ 50 Changed at 2014-10-27T01:38:29Z by daira

  • Resolution set to fixed
  • Status changed from new to closed

Fixed in 87a6894e6299148d639279fbd909110ba00d0080/trunk.

Replying to warner:

Note that this patch doesn't completely close #1159: it stops using the *contents* of the .tac files, but still relies upon their presence, in particular their filename: if "client" is a substring of the filename of the first .tac file found in the node's basedir, then we instantiate a Client, if "introducer" is in the string, we instantiate an IntroducerNode, etc.

We'll want another patch, after 1.11, to move this indicator into tahoe.cfg (and look for a .tac filename as a backwards-compatibility tool).

Filed as #2325. What we've done with this patch is sufficient to "make it possible to change appname (#2011), Python package-directory name (#1950), perhaps other names".

Note: See TracTickets for help on using tickets.