Level: Warning
Purpose:
To ensure that services which are supposed to be running stay running.
Scenario: A systemd service has been down (crashed) for 5 minutes.
Resolution:
The service is running again and remains in a stable state.
Manual Action Steps:
SSH into the affected server.
Use `ps aux | grep <app>` to see if the application is still running.
Use `systemctl status <app>` to check the status of the application.
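The two checks above can be combined into a small helper. This is a minimal sketch: `check_service` is a hypothetical function name and `myapp` is a placeholder for `<app>`. It relies only on `systemctl is-active`, which prints the unit state as a single word and exits non-zero unless the unit is active.

```shell
#!/usr/bin/env bash
# Sketch only: "check_service" is a hypothetical helper, not an existing tool.

check_service() {
    local app="$1"
    local state
    # is-active prints one word (active, inactive, failed, activating, ...)
    # and exits non-zero unless the unit is active, hence the "|| true".
    state="$(systemctl is-active "$app" 2>/dev/null || true)"
    echo "${app}: ${state:-unknown}"
    # Succeed only when the unit is reported as active.
    [ "$state" = "active" ]
}

# Example (placeholder unit name):
#   check_service myapp || systemctl status myapp --no-pager
```

A state of `activating` that never settles is usually a sign the service is stuck in a restart loop.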
If the application is restarting over and over, open /etc/systemd/system/<app>.service.
Change the `Restart=` line from `on-failure` or `always` to `no` (systemd has no `off` value), then run `sudo systemctl daemon-reload` so the change takes effect.
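If you would rather not edit the unit file by hand, the change can be scripted. The sketch below assumes the unit file already contains a `Restart=` line. `disable_auto_restart` is a hypothetical helper name. The value systemd accepts for disabling automatic restarts is `no`.

```shell
#!/usr/bin/env bash
# Sketch only: "disable_auto_restart" is a hypothetical helper.

disable_auto_restart() {
    local unit_file="$1"
    # Replace whatever value Restart= currently has (on-failure, always, ...)
    # with "no". A copy of the original is kept as <unit_file>.bak.
    # The pattern requires "=" straight after "Restart", so RestartSec= is untouched.
    sed -i.bak -E 's/^Restart=.*/Restart=no/' "$unit_file"
}

# Example (run as root, <app> replaced with the real unit name):
#   disable_auto_restart /etc/systemd/system/<app>.service
#   systemctl daemon-reload
```

Remember to restore the original `Restart=` value from the `.bak` copy once the underlying problem is fixed.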
After attempting to restart the application, use `sudo journalctl -ex` to read the most recent system logs. `sudo journalctl -u <app>` limits the output to that service.
In the past, applications not written by Errigal have failed because the system user they run as was missing.
The ELK stack and MySQL require the elasticsearch, logstash, kibana and mysql users respectively.
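A quick way to confirm the required users exist is to ask `id` about each one. This is a minimal sketch; `missing_users` is a hypothetical helper name, and the user list in the example is the one named above.

```shell
#!/usr/bin/env bash
# Sketch only: "missing_users" is a hypothetical helper.

# Print every user name passed as an argument that does not exist on this host.
missing_users() {
    local user
    for user in "$@"; do
        # "id -u" exits non-zero when the user is unknown.
        id -u "$user" >/dev/null 2>&1 || echo "$user"
    done
}

# Example:
#   missing_users elasticsearch logstash kibana mysql
```

No output means every listed user is present.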
Sometimes fully stopping the service with `sudo systemctl stop <app>`, waiting a minute, and then starting it again with `sudo systemctl start <app>` helps the application recover.
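The stop, wait, start sequence can be wrapped in one function so the pause is not skipped. This is a sketch under assumptions: `cold_restart` is a hypothetical name, and the 60 second default simply mirrors the "for a minute" guidance above.

```shell
#!/usr/bin/env bash
# Sketch only: "cold_restart" is a hypothetical helper.

cold_restart() {
    local app="$1"
    local wait_seconds="${2:-60}"   # default pause of one minute
    sudo systemctl stop "$app"
    sleep "$wait_seconds"
    sudo systemctl start "$app"
    # Report the resulting state; exits non-zero if the unit is not active.
    systemctl is-active "$app"
}

# Example (placeholder unit name):
#   cold_restart myapp
```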
If the bash prompt is behaving strangely, the server is likely running out of RAM. `free -h` shows the current memory usage.
Also check the disk space with `df -h`.
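To spot a full disk without reading the whole `df` table, the output can be filtered. The sketch below is illustrative: `full_filesystems` is a hypothetical helper, and the 90% default threshold is an assumption, not a documented limit.

```shell
#!/usr/bin/env bash
# Sketch only: "full_filesystems" is a hypothetical helper.

# Print "<mount point> <usage>" for filesystems at or above a usage threshold.
full_filesystems() {
    local threshold="${1:-90}"   # assumed default of 90%
    # -P forces the POSIX layout, one line per filesystem:
    # Filesystem 1024-blocks Used Available Capacity Mounted-on
    # Note: mount points containing spaces are not handled by this sketch.
    df -P | awk -v limit="$threshold" '
        NR > 1 {
            use = $5
            sub(/%/, "", use)
            if (use + 0 >= limit) print $6 " " $5
        }'
}

# Example:
#   full_filesystems 90
```

No output means no filesystem has reached the threshold.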
If this is an Errigal app, the application logs from before the service died are available in Kibana at moros.err:5601/app/kibana.
Auto Clear:
It's entirely possible the service will automatically recover.