Watchdog Agent

Alarm

CRITICAL - WatchdogAgent : errigalWatchdogStateApplicationFailureNotification - WatchdogApplicationFailureAlarm.

Context

This is one of the most important Watchdog alerts you will see. When a Watchdog cannot start, it generates the Watchdog Agent alert. The Watchdog Agent is part of a Watchdog.

Decision

When this alert is received you must:

Use the following query to check whether the alarm has cleared on Atlas:

select * from active_alarm where cleared = false and context like '%Watchdog%'

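Alternatively, the alarm can be cleared manually from a terminal. Note that this command sets cleared to true on every matching alarm rather than just checking it: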

mysql -uroot -p(add password) -hatlas.err -e "update snmp_manager.active_alarm set cleared = True where cleared is False and context like '%Watchdog%'";
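
The check and clear statements above can be rehearsed against a throwaway table before running them on Atlas. The following is a minimal sketch only: it uses an in-memory SQLite database as a stand-in for the MySQL snmp_manager schema, and the id column, the sample rows, and the function names are assumptions made for illustration. Only the active_alarm table name and the cleared and context columns come from the queries above.

```python
import sqlite3

# Same filters as the runbook queries, written with 0/1 so they run on SQLite.
CHECK_SQL = (
    "SELECT id, context FROM active_alarm "
    "WHERE cleared = 0 AND context LIKE '%Watchdog%'"
)
CLEAR_SQL = (
    "UPDATE active_alarm SET cleared = 1 "
    "WHERE cleared = 0 AND context LIKE '%Watchdog%'"
)


def make_demo_db():
    """Build a stand-in table with hypothetical rows (not real Atlas data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE active_alarm ("
        "id INTEGER PRIMARY KEY, context TEXT, cleared INTEGER)"
    )
    conn.executemany(
        "INSERT INTO active_alarm (context, cleared) VALUES (?, ?)",
        [
            ("WatchdogAgent", 0),     # active Watchdog alarm
            ("WatchdogAgent", 1),     # Watchdog alarm that already cleared
            ("SomeOtherContext", 0),  # unrelated active alarm
        ],
    )
    conn.commit()
    return conn


def uncleared_watchdog_alarms(conn):
    """Return the Watchdog alarms that have not yet cleared."""
    return conn.execute(CHECK_SQL).fetchall()


def clear_watchdog_alarms(conn):
    """Mark every uncleared Watchdog alarm as cleared; return rows changed."""
    cursor = conn.execute(CLEAR_SQL)
    conn.commit()
    return cursor.rowcount
```

In this sketch, calling uncleared_watchdog_alarms before and after clear_watchdog_alarms shows the effect of the update: the Watchdog rows drop out of the check query, while alarms with other contexts are left untouched.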

Consequences

If a Watchdog Agent alert is not actioned, we could miss an important alert. This could happen as follows: