resolution_area:watchdog_resolutions:res-w9104

Differences

This shows you the differences between two versions of the page.

resolution_area:watchdog_resolutions:res-w9104 [2021/06/29 16:49] btobin
resolution_area:watchdog_resolutions:res-w9104 [2021/07/05 12:30] (current) 10.91.120.28
Line 5: Line 5:
  
 **Purpose:** **Purpose:**
 +Alerts if the active alarm table in the SNMP manager database is out of sync with the active alarm table in the alarm cache database.
  
 **Scenario:** **Scenario:**
 +Alarm cache may not be consuming messages. Check whether the alarm cache logs show messages being processed. Alarm cache may need a restart and the RabbitMQ queue may need to be purged.
  
 **Resolution:** **Resolution:**
 +Run the query below to see how far out of sync the two tables are, then follow [[http://wiki.err/doku.php?id=development:applications:alarmcache:troubleshooting|this wiki article]] to fix the problem.
 +''select sum(individual_counts) as 'COUNT(*)' from ( (select count(*) as individual_counts from snmp_manager.active_alarm smaa where !smaa.cleared and not (smaa.id in (select id from alarm_cache.active_alarm))) union all (select count(*) from alarm_cache.active_alarm acaa join snmp_manager.active_alarm smaa on smaa.id = acaa.id where smaa.cleared) union all (select count(*) from alarm_cache.active_alarm acaa left join snmp_manager.active_alarm smaa on smaa.id = acaa.id where smaa.id is null) ) as tmp_count_table;''
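
The query adds up three kinds of mismatch: alarms that are uncleared in the SNMP manager but missing from the alarm cache, alarms still in the alarm cache that the SNMP manager has already cleared, and alarm cache rows with no SNMP manager row at all. The following is a minimal in-memory sketch of that counting logic, not the production check. The ''SnmpAlarm'' class and ''out_of_sync_count'' function are hypothetical names introduced for illustration, and the sketch assumes alarm ids are unique within each table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnmpAlarm:
    """Hypothetical stand-in for one row of snmp_manager.active_alarm."""
    id: int
    cleared: bool


def out_of_sync_count(snmp_alarms, cache_ids):
    """Mirror the three UNION ALL branches of the query on in-memory data.

    Assumes ids are unique in both inputs (an assumption, not stated in the wiki).
    """
    cache_ids = set(cache_ids)
    snmp_by_id = {alarm.id: alarm for alarm in snmp_alarms}

    # Branch 1: uncleared in the SNMP manager but missing from the alarm cache.
    missing_from_cache = sum(
        1 for alarm in snmp_alarms
        if not alarm.cleared and alarm.id not in cache_ids
    )
    # Branch 2: still in the alarm cache but already cleared in the SNMP manager.
    stale_cleared = sum(
        1 for alarm_id in cache_ids
        if alarm_id in snmp_by_id and snmp_by_id[alarm_id].cleared
    )
    # Branch 3: in the alarm cache with no matching SNMP manager row.
    orphaned = sum(1 for alarm_id in cache_ids if alarm_id not in snmp_by_id)

    return missing_from_cache + stale_cleared + orphaned


if __name__ == "__main__":
    snmp = [
        SnmpAlarm(1, False),  # in sync: uncleared and cached
        SnmpAlarm(2, False),  # branch 1: uncleared but not cached
        SnmpAlarm(3, True),   # branch 2: cleared but still cached
        SnmpAlarm(4, True),   # in sync: cleared and not cached
    ]
    cache = [1, 3, 5]         # id 5 is branch 3: cached with no SNMP row
    print(out_of_sync_count(snmp, cache))  # prints 3
```

Each mismatched alarm falls into exactly one branch, so the total is the number of alarms that differ between the two tables.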
 +
  
 **Manual Action Steps:** **Manual Action Steps:**
 +Restart alarm cache, purge the RabbitMQ queue, and run the alarm cache audit job to fix the sync issue.
  
 **Auto Clear:** **Auto Clear:**
 +Clears when the result of the query above drops below 800.
  
resolution_area/watchdog_resolutions/res-w9104.1624981762.txt.gz · Last modified: 2021/06/29 16:49 by btobin