ElastAlertNoScrape

Level: Critical

Purpose: Notify Errigal Operations staff that the elastalert application on prometheus prod has not completed a scrape of the DF-80 (80% disk full) or DF-60 (60% disk full) ElastAlert rules in the last 2 minutes.

Resolution: Elastalert runs in a Docker container on prometheus prod. It monitors NUC disk space in KLA's network by checking the Elasticsearch data containing the collected operating system stats. If these rules do not catch log files filling up the HDD, the NUC will go offline.

Manual Action Steps:

sudo docker logs --tail 100 elastalert --follow
sudo docker restart elastalert

After the restart, observe the logs and confirm that elastalert runs the query for both DF-80 and DF-60. Lines like the following two should be present.

<code>INFO:elastalert:Queried rule df-80 from 2024-08-06 19:41 UTC to 2024-08-06 19:52 UTC: 0 / 0 hits</code>

<code>INFO:elastalert:Queried rule df-60 from 2024-08-06 19:41 UTC to 2024-08-06 19:53 UTC: 0 / 0 hits</code>
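
The log check can also be scripted instead of read by eye. The following is a minimal sketch, not part of the existing tooling: `check_rules` is a hypothetical helper name, and it assumes the log lines keep the `INFO:elastalert:Queried rule <name> ` prefix shown in the two sample lines. It reads log text on stdin and reports whether each rule was queried.

```shell
#!/bin/sh
# Hypothetical helper (not part of elastalert): reads elastalert log text on
# stdin and reports whether a "Queried rule" line exists for each expected rule.
# Assumption: the line prefix matches the sample log lines in this page.
check_rules() {
    logs=$(cat)
    missing=0
    for rule in df-80 df-60; do
        if printf '%s\n' "$logs" | grep -q "INFO:elastalert:Queried rule $rule "; then
            echo "OK: $rule queried"
        else
            echo "MISSING: $rule"
            missing=1
        fi
    done
    # Exit status 0 only when both rules were found.
    return $missing
}
```

On the prometheus prod host it could be used as `sudo docker logs --tail 100 elastalert 2>&1 | check_rules`. The `2>&1` merges stderr into the pipe, since `docker logs` replays the container's stdout and stderr separately. A non-zero exit status means at least one rule has no query line in the last 100 log lines.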

Auto Clear: The alert clears automatically once the elastalert_scrapes_total counter increases again.

resolution_area/prometheus_resolutions/res-p9129.1722974031.txt.gz · Last modified: 2024/08/06 20:53 by 10.91.120.100