#475 closed defect

CPU-watcher munin graph got stuck

Reported by: warner
Owned by:
Priority: minor
Milestone: undecided
Component: code-nodeadmin
Version: 1.1.0
Keywords: munin statistics
Cc:
Launchpad Bug:

Description

One of our webapi nodes locked up: it used a lot of memory, twistd got an error, tried to kill itself, and failed. The node was using 100% CPU for a few minutes.

The problem was that the CPU-watcher kept reporting that 100% CPU figure to munin for the next day and a half, and the CPU percentages reported for the other nodes under its supervision were stuck at their previous values too. If the CPU watcher is writing to a file, then we need to change the munin plugin to ignore files that are more than 10 minutes old, or something similar.
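
A minimal sketch of what that staleness check could look like, assuming the CPU watcher does write its readings to a file that the munin plugin then reads. The names `is_stale`, `format_values`, and `MAX_AGE_SECONDS` are made up for illustration and are not taken from the existing plugin. When the file is too old, the sketch reports `U` (munin's marker for an unknown value) rather than repeating the last number, so the graph would show a gap and not a flat line.

```python
import os
import time

# Assumption: the "10 minutes" threshold suggested in this ticket.
MAX_AGE_SECONDS = 10 * 60


def is_stale(path, max_age=MAX_AGE_SECONDS, now=None):
    """Return True if path is missing or was last modified more than max_age seconds ago."""
    if now is None:
        now = time.time()
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        # A missing or unreadable file is treated the same as an old one.
        return True
    return (now - mtime) > max_age


def format_values(readings, stale):
    """Build munin 'field.value' lines, reporting U (unknown) when the data is stale.

    readings is a hypothetical dict mapping a munin field name to a CPU percentage.
    """
    lines = []
    for name, percent in sorted(readings.items()):
        value = "U" if stale else "%.2f" % percent
        lines.append("%s.value %s" % (name, value))
    return lines
```

The plugin would call `is_stale()` on the watcher's output file before printing anything, and pass the result to `format_values()`. Treating a missing file as stale is a design choice in this sketch: a watcher that has died and never written its file should not produce values either.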

Change History (0)
