resolution_area:prometheus_resolutions:res-p1111
Revisions compared: [2021/06/24 14:32] btobin and [2021/12/23 14:21] (current) 10.91.120.28
===== Watchdog (#) =====

**Level:** Critical :?:

**Purpose:** Alerts operations that the Watchdog agent is not running on a server. This can lead to support missing critical alerts.

**Scenario:**
When a Watchdog cannot start, it generates the Watchdog Agent alert.
This leaves an active alarm on the Watchdog Agent.

**Resolution:**
Check whether Watchdog is running on the server and restart it if necessary.
Logs are available at watchdog/logs.

**Manual Action Steps:**
Check for an uncleared active alarm on the Cerberus server:
<code>SELECT * FROM snmp_manager.active_alarm WHERE cleared = false AND context LIKE '%Watchdog%';</code>

Manually clear the Watchdog Agent alarm:
<code>UPDATE snmp_manager.active_alarm SET cleared = true WHERE cleared = false AND context LIKE '%Watchdog%';</code>
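The check-then-clear steps above could also be scripted so that the update runs in a transaction and reports how many alarms it cleared. The following is only a minimal sketch: it uses an in-memory SQLite table as a stand-in for ''active_alarm'' (the real database engine, connection details, and any columns other than ''cleared'' and ''context'' are not given on this page and are assumptions), and the function name ''clear_watchdog_alarms'' is hypothetical.

```python
import sqlite3


def clear_watchdog_alarms(conn):
    """Mark uncleared Watchdog alarms as cleared; return how many rows changed.

    Hypothetical helper: mirrors the manual UPDATE statement, but runs it
    inside a transaction so a failure leaves the table untouched.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE active_alarm SET cleared = ? "
            "WHERE cleared = ? AND context LIKE ?",
            (True, False, "%Watchdog%"),
        )
        return cur.rowcount


# Stand-in table for illustration only (assumed schema, not the real one).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE active_alarm ("
    "id INTEGER PRIMARY KEY, context TEXT, cleared BOOLEAN)"
)
conn.executemany(
    "INSERT INTO active_alarm (context, cleared) VALUES (?, ?)",
    [
        ("Watchdog Agent", False),  # the alarm we want to clear
        ("Disk usage", False),      # unrelated alarm, must stay active
        ("Watchdog Agent", True),   # already cleared earlier
    ],
)
conn.commit()

cleared_count = clear_watchdog_alarms(conn)

# Re-run the "check" query to confirm nothing Watchdog-related is left.
remaining = conn.execute(
    "SELECT COUNT(*) FROM active_alarm WHERE cleared = ? AND context LIKE ?",
    (False, "%Watchdog%"),
).fetchone()[0]

# Unrelated alarms should be unaffected.
other_active = conn.execute(
    "SELECT COUNT(*) FROM active_alarm WHERE cleared = ? AND context = ?",
    (False, "Disk usage"),
).fetchone()[0]
```

Returning the row count makes it easy to spot a pattern that matched more (or fewer) alarms than expected before moving on. Note that ''LIKE'' matching rules differ between database engines (for example, in case sensitivity), so the pattern should be checked against the real server.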

**Auto Clear:**
The alarm clears automatically once the issue has been resolved.