AppDown

Level: Critical FIXME

Purpose: To monitor application status and alert if it is not responsive

Scenario: <application> on <server> has been down for more than 120s.

Resolution: Restart the application

Manual Action Steps: If the application is one of the Grails apps, use the start/stop.sh scripts in the /var/tomcat/$application/bin directory. Remember to start SNMP Manager with sudo. Alternatively, if the application runs as a service, like the Spring Boot apps, restart it with: sudo systemctl restart $application
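
The choice between the two restart paths can be sketched as a small shell helper. This is only an illustration, not an existing script on the servers: the function name restart_plan and the TOMCAT_ROOT override are hypothetical, and it assumes the Grails scripts are named stop.sh and start.sh and that a Grails app can be recognised by its /var/tomcat/$application/bin directory. It only prints the commands so they can be reviewed before being run by hand.

```shell
#!/bin/sh
# Hypothetical helper: print the restart commands for an application
# without executing them, so they can be reviewed first.
# TOMCAT_ROOT is an assumed override for testing; it defaults to the
# /var/tomcat path given in the manual steps.
restart_plan() {
    application="$1"
    tomcat_root="${TOMCAT_ROOT:-/var/tomcat}"
    bin_dir="$tomcat_root/$application/bin"

    if [ -d "$bin_dir" ]; then
        # Assumption: a bin directory under the Tomcat root means a Grails app,
        # and its scripts are named stop.sh and start.sh.
        echo "$bin_dir/stop.sh"
        echo "$bin_dir/start.sh"
    else
        # Otherwise treat it as a systemd service (e.g. a Spring Boot app).
        echo "sudo systemctl restart $application"
    fi
}
```

For example, restart_plan myservice prints the systemctl command when no /var/tomcat/myservice/bin directory exists. The helper does not cover the SNMP Manager case, which still needs to be started with sudo as noted above.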

Auto Clear: The alert clears automatically once the application is responsive again.

resolution_area/prometheus_resolutions/res-p1101.txt · Last modified: 2021/07/05 11:15 by 10.91.120.28