[tahoe-lafs-trac-stream] [Tahoe-LAFS] #4186: One server process did not start on testgrid due to PID File collision
Tahoe-LAFS
trac at tahoe-lafs.org
Wed Aug 20 14:18:56 UTC 2025
#4186: One server process did not start on testgrid due to PID File collision
---------------------------+---------------------------
Reporter: hacklschorsch | Owner:
Type: defect | Status: new
Priority: normal | Milestone: undecided
Component: unknown | Version: n/a
Keywords: | Launchpad Bug:
---------------------------+---------------------------
After a hardware failure and reboot, one of the storage nodes on the
testgrid did not start because the PID recorded in its PID file was
already in use.
This issue seems two-fold:
1. PID files are still used even when they shouldn't be. I thought I had
turned them off by setting `pidfile=` (empty), because systemd does not
need them. That does not seem to work.
2. The PID file mechanism should compare PID *and* start time
([https://tahoe-lafs.readthedocs.io/en/latest/running.html#multiple-instances|as documented]),
but the recorded start time does not seem to help distinguish the process
now running under that PID from the node's own process.
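To illustrate what point 2 expects, here is a minimal sketch of a "PID plus start time" staleness check. It is not Tahoe-LAFS's actual code: the `PID START_TIME` record layout, the helper names `parse_pidfile` and `is_stale`, the one-second tolerance, and the example timestamps are all assumptions made for illustration. The start-time lookup is passed in as a callable so the logic can be exercised without a real process table.

```python
from typing import Callable, Optional, Tuple


def parse_pidfile(text: str) -> Tuple[int, float]:
    """Parse an assumed 'PID START_TIME' record (hypothetical format)."""
    pid_str, start_str = text.split()
    return int(pid_str), float(start_str)


def is_stale(
    recorded_pid: int,
    recorded_start: float,
    start_time_of: Callable[[int], Optional[float]],
    tolerance: float = 1.0,
) -> bool:
    """Return True if the PID file no longer describes a live instance.

    start_time_of(pid) must return the start time of the process that
    currently holds that PID, or None if no such process exists.
    """
    live_start = start_time_of(recorded_pid)
    if live_start is None:
        # Nothing is running under the recorded PID.
        return True
    # Same PID but a different start time means the PID was reused
    # by an unrelated process, for example after a reboot.
    return abs(live_start - recorded_start) > tolerance


# Hypothetical situation: PID 738 was recorded before the reboot, and an
# unrelated process received PID 738 afterwards (timestamps are made up).
processes_after_reboot = {738: 1755665500.0}
pid, start = parse_pidfile("738 1755579000.0")
print(is_stale(pid, start, processes_after_reboot.get))
```

With a check like this, a PID file left over from before the reboot would be treated as stale and the node would be allowed to start; the log below suggests that this is not what happened here.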
Snippet:
{{{
Aug 20 04:52:48 testgrid tahoe[5284]: ERROR: A process is already running as PID 738
Aug 20 04:52:48 testgrid tahoe[5284]: 'tahoe run' in '/var/lib/tahoe-lafs/alpha'
}}}
-----
Full systemctl status output:
{{{
[root@testgrid:~]# systemctl status tahoe.alpha.service
× tahoe.alpha.service - Tahoe LAFS node alpha
Loaded: loaded (/etc/systemd/system/tahoe.alpha.service; enabled; preset: ignored)
Active: failed (Result: exit-code) since Wed 2025-08-20 04:52:48 UTC; 1min 21s ago
Duration: 1.533s
Invocation: debbaab77c594b87983e657284f8d1b1
Process: 5280 ExecStartPre=/nix/store/sgw152fwaab21wr3fch4ig8cqcz3nw3n-unit-script-tahoe.alpha-pre-start/bin/tahoe.alpha-pre-start>
Process: 5284 ExecStart=/nix/store/v0a0359ssg6avrf34a06kaz02cmz860p-python3-tahoe-lafs/bin/tahoe run --allow-stdin-close $STATE_DI>
Main PID: 5284 (code=exited, status=1/FAILURE)
IP: 0B in, 0B out
IO: 0B read, 0B written
Mem peak: 61.6M
CPU: 1.514s
Aug 20 04:52:46 testgrid systemd[1]: Starting Tahoe LAFS node alpha...
Aug 20 04:52:46 testgrid systemd[1]: Started Tahoe LAFS node alpha.
Aug 20 04:52:48 testgrid tahoe[5284]: ERROR: A process is already running as PID 738
Aug 20 04:52:48 testgrid tahoe[5284]: 'tahoe run' in '/var/lib/tahoe-lafs/alpha'
Aug 20 04:52:48 testgrid systemd[1]: tahoe.alpha.service: Main process exited, code=exited, status=1/FAILURE
Aug 20 04:52:48 testgrid systemd[1]: tahoe.alpha.service: Failed with result 'exit-code'.
Aug 20 04:52:48 testgrid systemd[1]: tahoe.alpha.service: Consumed 1.514s CPU time, 61.6M memory peak.
}}}
--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/4186>
Tahoe-LAFS <https://Tahoe-LAFS.org>
secure decentralized storage