resolution_area:prometheus_resolutions:res-p1110

resolution_area:prometheus_resolutions:res-p1110 [2021/06/24 15:07] btobin
resolution_area:prometheus_resolutions:res-p1110 [2021/07/05 11:31] (current) 10.91.120.28
Line 1: Line 1:
 +=====PhysicalComponentTooHot=====
  
 +**Level:** Warning :!:
 +
 +
 +**Purpose:**
 +Reports the temperature of the physical hardware components in the servers.
 +
 +**Scenario:** A physical component has been operating above the safe temperature (> 75 degrees Celsius) for 5 minutes.
 +
 +**Resolution:**
 +Monitor the server and its alerts to see whether any processes are running that should not be, or are running longer than expected. Also check RAM usage.
 +
 +**Manual Action Steps:**
 +Kill any process that is running unnecessarily. If the alert still does not clear, the server may need maintenance.
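
The checks described in the resolution steps can be sketched in Python. This is a minimal illustration, not part of the original runbook: it assumes a Linux server with procps `ps` and `/proc/meminfo`, and the helper names (`parse_ps`, `top_by_cpu`, `mem_used_percent`, `report`) are hypothetical.

```python
import subprocess

# Assumed ps invocation (Linux procps): PID, CPU %, memory %, elapsed seconds, command name.
PS_COMMAND = ["ps", "-eo", "pid,pcpu,pmem,etimes,comm", "--sort=-pcpu"]


def parse_ps(ps_output):
    """Parse the table printed by PS_COMMAND into a list of dicts."""
    rows = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        line = line.strip()
        if not line:
            continue
        pid, pcpu, pmem, etimes, comm = line.split(None, 4)
        rows.append({
            "pid": int(pid),
            "cpu": float(pcpu),
            "mem": float(pmem),
            "elapsed_s": int(etimes),
            "command": comm,
        })
    return rows


def top_by_cpu(rows, n=5):
    """Return the n processes with the highest CPU usage, busiest first."""
    return sorted(rows, key=lambda r: r["cpu"], reverse=True)[:n]


def mem_used_percent(meminfo_text):
    """Compute used RAM as a percentage from the text of /proc/meminfo."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            values[key.strip()] = int(parts[0])  # values are reported in kB
    return 100.0 * (1 - values["MemAvailable"] / values["MemTotal"])


def report(n=5):
    """Print the busiest processes and RAM usage for the local server."""
    ps_output = subprocess.run(
        PS_COMMAND, capture_output=True, text=True, check=True
    ).stdout
    for row in top_by_cpu(parse_ps(ps_output), n):
        print(row)
    with open("/proc/meminfo") as f:
        print("RAM used: %.1f%%" % mem_used_percent(f.read()))
```

Calling `report()` on the affected server lists candidate processes together with how long each has been running. Whether a process is unnecessary remains a judgment call for the operator; once identified, it could be stopped with `os.kill(pid, signal.SIGTERM)`.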
 +
 +**Auto Clear:**
 +When the server temperature drops below 75 degrees Celsius.
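
The firing and auto-clear conditions can be illustrated with a small Python model. This is a simplified sketch, assuming the alert behaves like a Prometheus rule with a `> 75` condition and a 5-minute hold period. The rule itself is not shown on this page, and the function name `alert_states` is hypothetical.

```python
THRESHOLD_C = 75.0    # from the Scenario: > 75 degrees Celsius
FOR_SECONDS = 5 * 60  # from the Scenario: for 5m


def alert_states(samples):
    """Yield (timestamp, state) for time-ordered (timestamp_s, temp_celsius) samples.

    States: "inactive" (condition false), "pending" (condition true but not
    yet for the full hold period), "firing" (condition true for >= 5 minutes).
    """
    hot_since = None
    for ts, temp in samples:
        if temp > THRESHOLD_C:
            if hot_since is None:
                hot_since = ts  # start of the current hot streak
            state = "firing" if ts - hot_since >= FOR_SECONDS else "pending"
        else:
            # Assumption: the alert clears as soon as the condition is false,
            # so a reading of exactly 75 also counts as cleared here.
            hot_since = None
            state = "inactive"
        yield ts, state
```

In this model a single reading at or below the threshold resets the hold period, so a component that heats up again must stay hot for another full 5 minutes before the alert fires again.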