#1524 assigned defect

twistd can fail when checking whether a twistd.pid is stale

Reported by: davidsarah Owned by: davidsarah
Priority: major Milestone: undecided
Component: code-nodeadmin Version: 1.9.0a1
Keywords: twistd reliability usability tahoe-start unix Cc:
Launchpad Bug:

Description

After a failed hibernate/restore, I tried to restart some Tahoe nodes:

davidsarah@shinier:~/tahoe/1.9alpha$ bin/tahoe start ../grid/server1
STARTING '/home/davidsarah/tahoe/grid/server1'
Removing stale pidfile /home/davidsarah/tahoe/grid/server1/twistd.pid

davidsarah@shinier:~/tahoe/1.9alpha$ bin/tahoe start ../grid/server2
STARTING '/home/davidsarah/tahoe/grid/server2'
Can't check status of PID 2015 from pidfile /home/davidsarah/tahoe/grid/server2/twistd.pid: Operation not permitted

davidsarah@shinier:~/tahoe/1.9alpha$ ps -Fp 2015
UID        PID  PPID  C    SZ   RSS PSR STIME TTY          TIME CMD

davidsarah@shinier:~/tahoe/1.9alpha$ bin/tahoe start ../grid/server2
STARTING '/home/davidsarah/tahoe/grid/server2'
Can't check status of PID 2015 from pidfile /home/davidsarah/tahoe/grid/server2/twistd.pid: Operation not permitted

davidsarah@shinier:~/tahoe/1.9alpha$ ls -l ../grid/server2/twistd.pid
-rw-r--r-- 1 davidsarah davidsarah 4 2011-09-02 03:32 ../grid/server2/twistd.pid

davidsarah@shinier:~/tahoe/1.9alpha$ cat -v ../grid/server2/twistd.pid
2015

There was no process with PID 2015. (Note that ps -Fp would have listed the process even if it were owned by another user, including root.) I don't know why twistd is able to remove some stale pidfiles but not others.
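
For reference, a pidfile staleness check is conventionally done by sending signal 0 to the recorded PID and inspecting the resulting errno. The sketch below illustrates that general technique. It assumes twistd does something similar, which this ticket has not yet confirmed, and probe_pid is a hypothetical helper name, not a Twisted API. Under this scheme ESRCH means "no such process" (the pidfile is stale), while EPERM normally means a process with that PID exists but belongs to another user. EPERM would produce an "Operation not permitted" message like the one above, but it is at odds with the empty ps output.

```python
import errno
import os


def probe_pid(pid):
    """Classify a PID using a signal-0 probe (hypothetical helper).

    Returns "running", "stale", or "unknown".
    """
    try:
        # Signal 0 performs only the existence and permission checks;
        # no signal is actually delivered to the target process.
        os.kill(pid, 0)
    except OSError as e:
        if e.errno == errno.ESRCH:
            # No process has this PID: the pidfile is stale.
            return "stale"
        if e.errno == errno.EPERM:
            # The kernel refused the probe; we cannot tell whether
            # the PID belongs to the daemon we are looking for.
            return "unknown"
        raise
    return "running"
```

Calling probe_pid(2015) on the affected machine while the pidfile is still in place would show whether the EPERM comes from the kill call itself or from somewhere else in the startup path.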

The problem can be worked around by removing the twistd.pid file by hand, but that is not a satisfactory fix.

I'll file a bug against Twisted when I've investigated further.

Change History (4)

comment:1 Changed at 2011-09-02T16:29:56Z by davidsarah

  • Owner set to davidsarah
  • Status changed from new to assigned

comment:2 follow-up: Changed at 2011-09-02T16:33:32Z by warner

Maybe the pidfile was owned by a different user? Perhaps tahoe was accidentally started as root the previous time?

comment:3 in reply to: ↑ 2 ; follow-up: Changed at 2011-09-02T16:58:09Z by davidsarah

Replying to warner:

Maybe the pidfile was owned by a different user? Perhaps tahoe was accidentally started as root the previous time?

No, the pidfile is owned and writeable by davidsarah.

comment:4 in reply to: ↑ 3 Changed at 2011-09-02T17:02:49Z by davidsarah

Replying to davidsarah:

No, the pidfile is owned and writeable by davidsarah.

... and davidsarah also has rwx permissions on the parent directory, /home/davidsarah/tahoe/grid/server2.
