#3926 closed enhancement (fixed)

pid-file with creation-time

Reported by: meejah Owned by:
Priority: normal Milestone: undecided
Component: unknown Version: n/a
Keywords: Cc:
Launchpad Bug:

Description

As a consumer of "pid files", there are several cases to consider:

  • no PID file: nothing is running
  • pid file exists:
    • is there a currently running process with that PID?
    • no: stale file
    • yes: difficult (see below)

The case where a pid-file exists and points at a currently-running process is "difficult". BSD and many Linux programs assume the worst and simply exit (that is, they assume the running process is the daemon in question and don't want to run two copies).

This leaves the decision up to the user: they must decide whether the pidfile is correct (in which case, kill the offending process) or if the pidfile is incorrect (in which case, delete the pidfile).

If that "user" is some parent process (e.g. Gridsync), it is especially hard. The "name" of the process can change. The "command-line" may be different, depending on how it was run (and can be changed at runtime too, I believe).

One reliable indicator is "process start-time". Even in the case of a recycled PID (that is, a program that _isn't_ tahoe but happens to have the same PID) we can tell if it's "the tahoe process" (the creation-time will match) or something else (the creation-time will be later).
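A minimal sketch of that comparison, to make the three outcomes concrete. The function name, the `lookup_start_time` callback and the one-second `tolerance` are all hypothetical, not part of Tahoe or Gridsync; a real consumer might back the callback with something like psutil's `Process.create_time()`.

```python
from typing import Callable, Optional


def classify_recorded_process(
    recorded_pid: int,
    recorded_start: float,
    lookup_start_time: Callable[[int], Optional[float]],
    tolerance: float = 1.0,
) -> str:
    """Compare a recorded (pid, start-time) pair against the live process table.

    `lookup_start_time` is a hypothetical callback that returns the start-time
    of the process currently holding `pid`, or None if no such process exists.
    `tolerance` (an assumption) absorbs rounding differences between the value
    that was recorded and the value reported later.
    """
    actual_start = lookup_start_time(recorded_pid)
    if actual_start is None:
        # Nothing is running with that PID: the record is stale.
        return "stale"
    if abs(actual_start - recorded_start) <= tolerance:
        # Same PID and same start-time: this is the process that was recorded.
        return "ours"
    # The PID is in use, but by a process that started at a different time.
    return "recycled"


# Usage with a fake process table mapping pid -> start-time.
fake_table = {100: 1000.0, 200: 5000.0}
print(classify_recorded_process(100, 1000.0, fake_table.get))
print(classify_recorded_process(200, 1000.0, fake_table.get))
print(classify_recorded_process(300, 1000.0, fake_table.get))
```

Passing the lookup in as a callback keeps the decision logic independent of how start-times are obtained on a given platform.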

So, Gridsync desires a PID-file that includes the process start-time. Since our current twistd.pid file is written by Twisted machinery we can't easily change that -- and arguably it "is an API" so adding a timestamp breaks it.

I propose adding a new option --process-file which will write a file in the node-directory called running.process which will include both the PID and creation-time. (When we tackle https://tahoe-lafs.org/trac/tahoe-lafs/ticket/3925 this file will be the _only_ PID-file).
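A rough sketch of what writing and reading such a file could look like. The on-disk layout (PID and start-time on one line, separated by a space), the helper names and the write-then-rename step are assumptions for illustration; this ticket does not specify a format.

```python
import os
import tempfile
from pathlib import Path
from typing import Tuple


def write_process_file(path: Path, pid: int, start_time: float) -> None:
    """Write "<pid> <start-time>" to `path` (hypothetical format)."""
    # Write to a sibling temp file and rename it into place, so a reader
    # never observes a half-written file.
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(f"{pid} {start_time!r}\n")
    os.replace(tmp, path)


def read_process_file(path: Path) -> Tuple[int, float]:
    """Parse the file written by write_process_file()."""
    pid_text, start_text = path.read_text().split()
    return int(pid_text), float(start_text)


# Usage: round-trip through a temporary "node directory".
with tempfile.TemporaryDirectory() as node_dir:
    process_file = Path(node_dir) / "running.process"
    write_process_file(process_file, os.getpid(), 1663279113.25)
    print(read_process_file(process_file))
```

The start-time is passed in rather than computed here, since obtaining it portably needs platform-specific code or a third-party library.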

Change History (2)

comment:1 Changed at 2022-09-15T21:58:33Z by meejah

This seems a bit harder than expected.

We can convince twistd infrastructure to not write a pidfile by setting --pidfile to None. However, we integrate via twisted.internet.app.runApp which is a function that doesn't return (because it calls _exitWithSignal and kills us).

This means we can't use try-finally or a context manager. We can't use atexit (because it doesn't play nicely with signals). A facility like reactor.addSystemEventTrigger("during", "shutdown", ...) could work, but we don't have the reactor before calling runApp (and if we import it, the --reactor option won't work).
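To illustrate why try-finally doesn't help, here is a small POSIX-only sketch. It assumes the exit path restores the default signal handler and then re-sends the signal to its own process, which is my reading of "kills us" above; the child script and the marker file are hypothetical and do not involve Twisted at all.

```python
import os
import signal
import subprocess
import sys
import tempfile
from typing import Tuple

# Child program: create a marker file, then die by signal inside try/finally.
CHILD = r"""
import os, signal, sys
marker = sys.argv[1]
open(marker, "w").close()
try:
    # Assumed exit-by-signal behaviour: default handler, then signal ourselves.
    signal.signal(signal.SIGTERM, signal.SIG_DFL)
    os.kill(os.getpid(), signal.SIGTERM)
finally:
    # Cleanup that never runs, because the process is already gone.
    os.remove(marker)
"""


def run_child() -> Tuple[int, bool]:
    """Run the child and report (return code, whether the marker was left behind)."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        marker = os.path.join(tmp_dir, "running.process")
        proc = subprocess.run([sys.executable, "-c", CHILD, marker])
        return proc.returncode, os.path.exists(marker)


if __name__ == "__main__":
    returncode, marker_left = run_child()
    print("child return code:", returncode)
    print("marker file left behind:", marker_left)
```

The child is terminated by the signal itself, so the interpreter never unwinds the `finally` block and the marker file stays on disk. Any cleanup of a process file therefore has to happen before that point.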

comment:2 Changed at 2022-10-05T20:43:18Z by meejah

  • Resolution set to fixed
  • Status changed from new to closed