**Watchdog Agent**
**Alarm**
CRITICAL - WatchdogAgent : errigalWatchdogStateApplicationFailureNotification - WatchdogApplicationFailureAlarm.
**Context**
This is one of the most important Watchdog alerts you will see. When a Watchdog cannot start, it generates the Watchdog Agent alert. The Watchdog Agent is part of a Watchdog.
**Decision**
When this alarm is received, __you must__:
* First, check that the Watchdog is running and, if not, take the relevant action.
* Because the Watchdog Agent is not currently set up to clear itself, you must manually clear the alarm.
* You will then have to select "Alarm Clear received" on related Ticket(s) too.
* On the Node monitor, you can check "Review the logs" for Watchdog Agent alarms and clear them there.
* Alternatively, you can clear the alarm via the database (this option is preferred in this circumstance).
Use the following query to list Watchdog alarms that are still uncleared on Atlas (an empty result means the alarm has cleared):
select * from active_alarm where cleared = false and context like '%Watchdog%'
To clear the alarms from a terminal, run:
mysql -uroot -p(add password) -hatlas.err -e "update snmp_manager.active_alarm set cleared = True where cleared is False and context like '%Watchdog%'";
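
The check, clear, and re-check sequence above can be sketched as follows. This is a minimal illustration only: it uses an in-memory SQLite database as a stand-in for the MySQL `snmp_manager` database on Atlas, and the table layout (the `id` column and the column types) is an assumption. Only the `active_alarm` table name and the `cleared` and `context` columns come from the commands above. The function name `clear_watchdog_alarms` and the sample rows are hypothetical.

```python
import sqlite3

# Assumed stand-in schema: only `cleared` and `context` appear in the runbook;
# the `id` column and the column types are hypothetical.
SCHEMA = (
    "CREATE TABLE active_alarm ("
    "id INTEGER PRIMARY KEY, context TEXT, cleared INTEGER)"
)

# Same filters as the runbook queries, written with 0/1 for SQLite.
UNCLEARED = (
    "SELECT id, context FROM active_alarm "
    "WHERE cleared = 0 AND context LIKE '%Watchdog%'"
)
CLEAR = (
    "UPDATE active_alarm SET cleared = 1 "
    "WHERE cleared = 0 AND context LIKE '%Watchdog%'"
)


def clear_watchdog_alarms(conn):
    """List uncleared Watchdog alarms, clear them, then re-check."""
    before = conn.execute(UNCLEARED).fetchall()
    conn.execute(CLEAR)
    conn.commit()
    remaining = conn.execute(UNCLEARED).fetchall()
    return before, remaining


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO active_alarm (context, cleared) VALUES (?, ?)",
        [
            ("WatchdogAgent", 0),  # active Watchdog alarm, should be cleared
            ("WatchdogAgent", 1),  # already cleared, left as-is
            ("DiskUsage", 0),      # unrelated alarm, must stay active
        ],
    )
    before, remaining = clear_watchdog_alarms(conn)
    still_active = conn.execute(
        "SELECT context FROM active_alarm WHERE cleared = 0"
    ).fetchall()
    return before, remaining, still_active


if __name__ == "__main__":
    print(demo())
```

Running the select before and after the update confirms that the clear took effect and that alarms whose context does not match `%Watchdog%` are left untouched.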
**Consequences**
If a Watchdog Agent alarm is not actioned, we could miss an important alert. This could happen as follows:
* Watchdog is running but has an active alarm on the Watchdog Agent.
* Watchdog fails to start.
* The active alarm on the Watchdog Agent means we would not be alerted to the Watchdog failing.
* No Watchdog running for the system in question could lead to Operations not being informed of a Critical system problem.