Managing AWS EC2 Maintenance
Author: Padraig O Neill
Introduction
This page documents the process for managing upcoming scheduled maintenance of an AWS EC2 instance.
It was written following the work on ATC-996430.
AWS Emails
When an EC2 maintenance window is approaching, we will be notified by AWS.
The email will contain something similar to the following:
One or more of your Amazon EC2 instances is scheduled for maintenance on 2018-11-08 for 2 hours starting at 2018-11-08 00:00:00 UTC. During this time, the following instances in the us-east-1 region will be unavailable and then rebooted:
i-0ad507630ac3439fa
During the scheduled maintenance window, your instance will undergo a reboot. The instance will retain its IP address, DNS name, and any data on local instance-store volumes. Please note that any reboot you perform on your own will not alleviate the need for this maintenance. However, you can complete this maintenance at a time of your choosing by stopping and restarting your instance at any time prior to the scheduled maintenance window. Please note that the data on your local instance-store volume will not be persisted if you stop and start your instance.
Additional information about maintenance events can be found at http://docs.amazonwebservices.com/AWSEC2/latest/UserGuide/monitoring-instances-status-check_sched.html
We perform maintenance regularly to ensure the EC2 service runs stably and securely for our customers. If you have any questions or concerns, you can contact the AWS Support Team on the community forums and via AWS Premium Support at: http://aws.amazon.com/support
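When several of these emails arrive, it can help to pull the key details out mechanically before raising the ticket. The following is a minimal sketch, assuming the email body matches the wording quoted above; the function name `parse_maintenance_email` and the regular expressions are illustrative and not part of any AWS tooling.

```python
import re
from datetime import datetime, timezone

# Patterns are assumptions based on the sample email wording above.
INSTANCE_ID = re.compile(r"\bi-(?:[0-9a-f]{17}|[0-9a-f]{8})\b")
REGION = re.compile(r"in the ([a-z]{2}(?:-[a-z]+)+-\d) region")
WINDOW_START = re.compile(r"starting at (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) UTC")


def parse_maintenance_email(body):
    """Extract instance IDs, region and window start (UTC) from an email body."""
    region = REGION.search(body)
    start = WINDOW_START.search(body)
    window_start = None
    if start:
        # The email states the time in UTC, so attach that timezone explicitly.
        window_start = datetime.strptime(
            start.group(1), "%Y-%m-%d %H:%M:%S"
        ).replace(tzinfo=timezone.utc)
    return {
        "instance_ids": sorted(set(INSTANCE_ID.findall(body))),
        "region": region.group(1) if region else None,
        "window_start": window_start,
    }
```

Fields that cannot be found come back as `None` (or an empty list for the instance IDs), so a changed email template shows up as missing data rather than an exception.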
What is required?
A support ticket should be created to:
- Investigate the affected resource and schedule the maintenance for a time when there is little or no activity on the system.
- Monitor the maintenance while it is taking place.
- Ensure all functionality is restored following the maintenance.
An example MOP is available: https://docs.google.com/document/d/1JMlcyXAWiKuG6AYINDvIsgNhAg3_p6YryKAwHDFNYrU/edit
Main Process
The details vary by environment, but the main approach remains the same:
- Notify Customer of Maintenance
- Update Ansible and related Projects
- Comment Out Watchdogs and silence Prometheus Alerts
- Shut Down Applications
- Stop and Start the Instance
- Start-Up Applications
- Notify Customer that Applications are Available
- Sanity Test Applications
- Notify Customer that Testing is Finished
- Check that the scheduled maintenance notification has been removed from the AWS dashboard
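The final check in the list above can also be done programmatically rather than through the dashboard. The sketch below assumes a boto3 EC2 client is passed in and uses its `describe_instance_status` call; the helper name `pending_events` is hypothetical. AWS documents that completed events keep a description starting with `[Completed]`; treating `[Canceled]` the same way is an assumption worth verifying against a real response.

```python
def pending_events(ec2_client, instance_id):
    """Return scheduled events for the instance that are still outstanding."""
    response = ec2_client.describe_instance_status(
        InstanceIds=[instance_id],
        IncludeAllInstances=True,  # also report instances that are not running
    )
    # "[Completed]" is documented by AWS; "[Canceled]" is an assumption here.
    finished_prefixes = ("[Completed]", "[Canceled]")
    outstanding = []
    for status in response.get("InstanceStatuses", []):
        for event in status.get("Events", []):
            if not event.get("Description", "").startswith(finished_prefixes):
                outstanding.append(event)
    return outstanding
```

With boto3 installed and credentials configured, `boto3.client("ec2", region_name="us-east-1")` can be passed as `ec2_client`. An empty list means no maintenance is still scheduled for that instance.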
Things to remember
- In AWS, the Reboot action is not the same as stopping and starting an instance. Stopping and starting allows the instance to be moved to a different host; rebooting does not.
- Depending on the environment, there may be extra actions. For example, on the load balancer we must manually restart nginx for application headers to work.
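To make the first point concrete, a stop-and-start cycle is two separate API calls, each followed by a wait, rather than a single reboot call. This is a minimal sketch assuming a boto3 EC2 client is passed in; the function name `stop_start_instance` is hypothetical, and applications should already be shut down before it runs.

```python
def stop_start_instance(ec2_client, instance_id):
    """Stop the instance, wait until stopped, start it, wait until running."""
    ids = [instance_id]
    # Deliberately not reboot_instances: a reboot keeps the instance on the same host.
    ec2_client.stop_instances(InstanceIds=ids)
    ec2_client.get_waiter("instance_stopped").wait(InstanceIds=ids)
    ec2_client.start_instances(InstanceIds=ids)
    ec2_client.get_waiter("instance_running").wait(InstanceIds=ids)
```

As the AWS email notes, data on local instance-store volumes does not survive a stop and start, so confirm nothing important lives there before running this.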
Useful Resources
Some useful resources for EC2 maintenance:
databaseandnetworkmanagement/aws_ec2_maintenance.txt · Last modified: 2021/06/25 10:09 (external edit)