Opened at 2008-06-23T20:33:40Z
Last modified at 2020-12-09T15:00:31Z
#475 closed defect
CPU-watcher munin graph got stuck — at Initial Version
| Reported by: | warner | Owned by: | |
|---|---|---|---|
| Priority: | minor | Milestone: | undecided |
| Component: | code-nodeadmin | Version: | 1.1.0 |
| Keywords: | munin statistics | Cc: | |
| Launchpad Bug: | | | |
Description
We had a problem in one of our webapi nodes that caused it to lock up: it used a lot of memory, twistd got an error and tried to kill itself, and that failed. The node was using 100% CPU for a few minutes.
The problem was that the CPU-watcher kept reporting that 100% CPU to munin for the next day and a half (and the CPU percentages reported for the other nodes under its supervision were stuck at their previous values too). If the CPU watcher is writing to a file, then we need to change the munin plugin to ignore files that are more than 10 minutes old, or something similar.
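A minimal sketch of that staleness check, in Python. This is not the existing plugin code: the function names, the `MAX_AGE_SECONDS` constant, and the file layout (one ready-made `name.value N` line per node) are all assumptions for illustration, and it only applies if the CPU watcher really does hand its numbers to the plugin through a file.

```python
import os
import time

# Assumption: the "10 minutes" threshold suggested in the description.
MAX_AGE_SECONDS = 10 * 60


def is_fresh(path, max_age=MAX_AGE_SECONDS, now=None):
    """Return True if path exists and was modified within max_age seconds."""
    if now is None:
        now = time.time()
    try:
        mtime = os.stat(path).st_mtime
    except OSError:
        # A missing or unreadable file is treated the same as a stale one.
        return False
    return (now - mtime) <= max_age


def report_lines(path, max_age=MAX_AGE_SECONDS, now=None):
    """Return the lines to print for munin, or [] if the file is stale.

    Hypothetical file format: one "name.value N" line per watched node.
    """
    if not is_fresh(path, max_age, now):
        return []
    with open(path) as f:
        return [line.rstrip("\n") for line in f if line.strip()]
```

The plugin would print whatever `report_lines()` returns. With a stuck watcher it prints nothing, so the graph would presumably show a gap instead of a flat line at the last recorded value.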
