User Tools

Site Tools


Writing /app/www/public/data/meta/databaseandnetworkmanagement/replica_restore.meta failed
databaseandnetworkmanagement:replica_restore

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
databaseandnetworkmanagement:replica_restore [2021/03/18 10:40] bosowskidatabaseandnetworkmanagement:replica_restore [2021/06/25 10:09] (current) – external edit 127.0.0.1
Line 1: Line 1:
 +====== Replica Database Restore ======
  
 +==== The below steps are confirmed to work and have been performed on EXTDB2, after the server restarted and the replication was in an uncoverable state. ====
 +
 +===== Tmux =====
 +To ensure all of the lengthy processes run without the session disconnecting and in result terminating the process, use tmux. After logging into the server simply run:
 +<code>
 +tmux
 +</code>
 +
 +Now you will be inside of a tmux session, to which you can return even if you disconnect. All of the processes will be bound to that session and will continue running while it is active.
 +To detach from the session press ''ctrl+b'' following by ''d''
 +You can then return to it with
 +<code>
 +tmux attach
 +</code>
 +
 +===== Dumping =====
 +The below query will give you the total size of all databases, run it on the source DB:
 +<code>
 +select sum(size)
 +   from
 +    (SELECT table_schema AS "Database", ROUND(SUM(data_length + index_length) / 1024 / 1024, 2) AS "size" FROM information_schema.TABLES GROUP BY table_schema) as a;
 +</code>
 +
 +Below command is safe to use on a running primary database. Replace the --size parameter with the output from the above query.
 +<code>
 +mysqldump --all-databases --add-drop-database --allow-keywords --master-data=1 --flush-logs --single-transaction --triggers --routines -u root -p | pv --progress --size <size>m > alldb.dump.sql
 +</code>
 +
 +**Explanation of the --master-data parameter from the docs:**
 +'' --master-data (setting no value or 1 causes a CHANGE MASTER TO statement to be written to the dump, setting 2 causes the statement to be written but encased in SQL comments) and has the same effect as --master-data in terms of enabling or disabling other options and in how locking is handled.
 +'' The parameter essentially sets the binlog file and location automatically to ensure that we can resume replication without issues after restoring the dump.
 +
 +Backup the mysql database from the replica before restoring the dump, since we tend to have different sets of users on the replica than on primary for customer access. Restoring only the mysql.users table doesn't work, hence this.
 +<code>
 +mysqldump -uroot -p mysql > mysql.dump.sql
 +</code>
 +
 +Also backup 
 +<code>
 +mysqldump -uroot -p watchdog> watchdog.dump.sql
 +</code>
 +
 +===== Restoring =====
 +Move over the dump from primary to replica with:
 +<code>
 +rsync -z scotty@extdb1.ext:/backup/alldb.dump.sql scotty@extdb2:/backup/
 +</code>
 +
 +Before attempting to load in the dump, ensure that the `max_allowed_packet` is set to 600M or larger, without this you will run into ''ERROR 2006 (HY000) at line 9077: MySQL server has gone away'' during the restore.
 +
 +You can do this by running the following:
 +<code>show variables like 'max_allowed_packet';</code>
 +
 +You can set this by adding the following to the `my.cnf` file:
 +<code>max_allowed_packet = 600</code>
 +
 +Or with the mysql command for the current session:
 +<code>SET GLOBAL max_allowed_packet=600M</code>
 +
 +Stop the replica with the following command:
 +<code>STOP SLAVE;</code>
 +
 +To load in the dump into the replica, use the following command:
 +<code>
 +pv ./alldb.dump.sql | mysql -uroot -p
 +</code>
 +
 +The above will take a considerable amount of time - for a ~500Gb dump file, this process took ~18 hours on EXT DB2.
 +
 +You can then start the replica and it should start slowly catching up to primary:
 +<code>
 +START SLAVE;
 +</code>
 +
 +During the incident, I have manually extracted the user inserts from the previously taken mysql.dump.sql and added in the missing users that were missing after the primary dump import, but the below should work also:
 +Restore the mysq database on the replica with (restoring only the users table didn't seem to work from testing):
 +<code>
 +mysql -uroot -p mysql< mysql.dump.sql
 +</code>