Watchdog Agent
Alarm
CRITICAL - WatchdogAgent : errigalWatchdogStateApplicationFailureNotification - WatchdogApplicationFailureAlarm.
Context
This is one of the most important Watchdog alerts you will see. When a Watchdog cannot start, it generates the Watchdog Agent alert. The Watchdog Agent is part of a Watchdog.
Decision
When this alarm is received you must:
Use the following query to check whether the alarm has cleared on Atlas:
select * from active_alarm where cleared = false and context like '%Watchdog%'
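or via Terminal. Note that this command marks every matching alarm as cleared rather than only listing them; replace (add password) with the database password: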
mysql -uroot -p(add password) -hatlas.err -e "update snmp_manager.active_alarm set cleared = True where cleared is False and context like '%Watchdog%'";
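The two commands above can be kept side by side in a small script so that the read-only check is always run before the update. This is only a sketch: the function names list_sql and clear_sql are hypothetical, and it assumes the select query targets the same snmp_manager.active_alarm table that the update uses.

```shell
#!/bin/sh
# Sketch only. Host, schema, table and column names are copied from the
# commands above; list_sql and clear_sql are hypothetical helper names.

ATLAS_HOST="atlas.err"
FILTER="context like '%Watchdog%'"

# Read-only: lists Watchdog alarms that have not cleared yet.
list_sql() {
    printf "select * from snmp_manager.active_alarm where cleared = false and %s" "$FILTER"
}

# Destructive: marks every matching alarm as cleared.
clear_sql() {
    printf "update snmp_manager.active_alarm set cleared = True where cleared is False and %s" "$FILTER"
}

# Usage (a bare -p makes mysql prompt for the password):
#   mysql -uroot -p -h"$ATLAS_HOST" -e "$(list_sql)"
#   mysql -uroot -p -h"$ATLAS_HOST" -e "$(clear_sql)"
```

Running the list query first shows exactly which rows the update would touch, which helps avoid clearing an alarm that still needs attention.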
Consequences
If a Watchdog Agent alarm is not actioned, we could miss an important alert. This could happen as follows: