resolution_area:prometheus_resolutions:res-p1402

Differences

This shows you the differences between two versions of the page.

resolution_area:prometheus_resolutions:res-p1402 [2021/06/24 14:20] btobin
resolution_area:prometheus_resolutions:res-p1402 [2021/12/21 11:10] (current) wflaherty
Line 1: Line 1:
 +=====CriticalDiskSpace=====
  
 +**Level:** __Critical__ FIXME
 +
 +
 +**Purpose:** To ensure the server doesn't crash because of a full disk.
 +
 +**Scenario:** <server> of job <application> has had less than 10% space remaining for 4m.
 +
 +**Resolution:** The alert resolves once more than 10% of the disk is free.
 +
 +**Manual Action Steps:**
 +
 +How serious this is depends on the size of the disk and the role of the server.
 +If 400GB of a 4TB disk is still free, the situation is not urgent. However, if the disk is filling rapidly and may reach full capacity, act quickly and stop whichever application is responsible.
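
To judge whether the disk is "rapidly filling", it can help to compare two free-space readings taken a short time apart. The following is a minimal sketch, not part of the original runbook: the function names, the `/` mount point, and the 60-second interval are illustrative assumptions. Note that 400GB free on a 4TB disk is exactly 10%, the alert boundary.

```python
import shutil
import time


def free_fraction(total_bytes, free_bytes):
    """Return the fraction of the disk that is still free (0.0 to 1.0)."""
    return free_bytes / total_bytes


def seconds_until_full(free_before, free_after, interval_seconds):
    """Estimate seconds until the disk is full from two free-space samples.

    Assumes a constant fill rate. Returns None when free space is not shrinking.
    """
    consumed = free_before - free_after
    if consumed <= 0:
        return None
    # remaining bytes divided by (consumed bytes per second)
    return free_after * interval_seconds / consumed


def sample(path="/", interval_seconds=60):
    """Take two free-space samples on `path` (assumed mount point), `interval_seconds` apart."""
    before = shutil.disk_usage(path).free
    time.sleep(interval_seconds)
    usage = shutil.disk_usage(path)
    return (
        free_fraction(usage.total, usage.free),
        seconds_until_full(before, usage.free, interval_seconds),
    )
```

Calling `sample("/")` returns the current free fraction and a rough time-to-full estimate in seconds; a short estimate suggests stopping the offending application first and cleaning up afterwards.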
 +
 +Ensure you check the size of the log files.
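
One way to check log file sizes is to list the largest files under the log directory. This is a hedged sketch rather than an established procedure for this alert: `largest_files` is a hypothetical helper, and `/var/log` is an assumed location that may differ per application.

```python
import os


def largest_files(root, limit=10):
    """Return up to `limit` (size_bytes, path) pairs under `root`, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                # File vanished or is unreadable (e.g. rotated mid-scan); skip it.
                continue
    sizes.sort(reverse=True)
    return sizes[:limit]


if __name__ == "__main__":
    # "/var/log" is an assumed location; adjust to the application's log path.
    for size, path in largest_files("/var/log"):
        print(f"{size / 1024**2:10.1f} MiB  {path}")
```

Run it with enough privileges to read the log directory; unreadable files are skipped silently, so results from an unprivileged run may be incomplete.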
 +
 +See some of the steps outlined [[http://wiki.err/doku.php?id=resolution_area:prometheus_resolutions:res-p1403|here]].
 +
 +
 +
 +**Auto Clear:** Possible.