Thursday, October 11, 2012

Round Robin Replication using GTID

In a previous post I showed how to implement multi-source round-robin replication in pure SQL using the tables that are needed for crash-safe replication. I also outlined a revised version of this approach in the Replication Tips & Tricks presentation I gave at MySQL Connect. This was, however, before the GTID (Global Transaction ID) implementation was done. Now that they are introduced, multi-source replication is even easier since you no longer have to keep track of the positions.

Figure 1. Tables for storing information about masters
CREATE TABLE my_masters (
    idx INT AUTO_INCREMENT,
    host CHAR(50) NOT NULL,
    port INT NOT NULL DEFAULT 3306,
    PRIMARY KEY (idx),
    UNIQUE INDEX (host,port)
);

CREATE TABLE current_master (
    idx INT
);

CREATE TABLE replication_user(
    name CHAR(40),
    passwd CHAR(40)
);

One caveat is that this only works if you are replicating from servers that have GTID enabled, so if you are trying to replicate from a pre-5.6 server, you can use the original implementation. I have added a re-factored version of the code last in this post, and you can also find some utility procedures that I use in the version described here.

For the version that uses GTID, we still keep the two tables we were using in the original implementation around, but we remove the file and position from the tables, giving the definitions seen in Figure 1. Also, the user and password is stored in a separate table and is assumed to be identical for all machines.

To fetch the new master, I created a fetch_next_master procedure that fetches the next master in turn and then advance current_master to the next master. The second select in the code below is used to handle the case that you have a table with masters defined as in Table 1.


delimiter $$
CREATE PROCEDURE fetch_next_master(
    OUT p_host CHAR(50), OUT p_port INT UNSIGNED,
    OUT p_file CHAR(50), OUT p_pos BIGINT)
BEGIN
   DECLARE l_next_idx INT DEFAULT 1;

   SELECT idx INTO l_next_idx FROM my_masters
    WHERE idx > (SELECT idx FROM current_master)
    ORDER BY idx LIMIT 1;

   SELECT idx INTO l_next_idx FROM my_masters
    WHERE idx >= l_next_idx
    ORDER BY idx LIMIT 1;

   UPDATE current_master SET idx = l_next_idx;

   SELECT host, port INTO p_host, p_port
     FROM my_masters WHERE idx = l_next_idx;
END $$
delimiter ;

Since we no longer need to save the position, the code for multi_source event is significantly simpler. All that is necessary is to change master to the next master in turn: the server remembers what transactions are missing automatically and will start replicating from the correct position.

delimiter $$
CREATE EVENT multi_source
    ON SCHEDULE EVERY 1 MINUTE DO
BEGIN
   DECLARE l_host CHAR(50);
   DECLARE l_port INT UNSIGNED;
   DECLARE l_user CHAR(40);
   DECLARE l_passwd CHAR(40);
   DECLARE l_file CHAR(50);
   DECLARE l_pos BIGINT;

   SET SQL_LOG_BIN = 0;

   CALL stop_slave_gracefully();
   START TRANSACTION;
   CALL fetch_next_master(l_host, l_port);
   SELECT name, passwd INFO l_user, l_passwd FROM replication_user;
   CALL change_master(l_host, l_port, l_user, l_passwd);
   COMMIT;
   START SLAVE;
END $$
delimiter ;

Full code for original implementation


Here is the code for replicating from pre-5.6 to 5.6 using the replication tables added for implementing crash-safe slaves.

Compared to the version described in the earlier post, I have added a few utility procedures such as a procedure to stop the slave gracefully. The procedure will first stop the I/O thread, and then empty the relay log before stopping the SQL thread. This is mainly to avoid having to re-transfer a lot of events from the master. Compared to the version provided in the previous post, I factored out some separate procedures. You can see the re-factored version last in the post.


delimiter $$
CREATE PROCEDURE change_master(
    p_host CHAR(40), p_port INT,
    p_user CHAR(40), p_passwd CHAR(40),
    p_file CHAR(40), p_pos LONG)
BEGIN
   SET @cmd = CONCAT('CHANGE MASTER TO ',
                     CONCAT('MASTER_HOST = "', p_host, '", '),
                     CONCAT('MASTER_PORT = ', p_port, ', '),
                     CONCAT('MASTER_USER = "', p_user, '", '),
                     CONCAT('MASTER_PASSWORD = "', p_passwd, '"'));

   IF p_file IS NOT NULL AND p_pos IS NOT NULL THEN
     SET @cmd = CONCAT(@cmd,
                       CONCAT(', MASTER_LOG_FILE = "', p_file, '"'),
                       CONCAT(', MASTER_LOG_POS = ', p_pos));
   END IF;
   PREPARE change_master FROM @cmd;
   EXECUTE change_master;
   DEALLOCATE PREPARE change_master;
END $$
delimiter ;

delimiter $$
CREATE PROCEDURE save_position()
BEGIN
   DECLARE l_idx INT UNSIGNED;
   DECLARE l_msg CHAR(60);

   UPDATE my_masters AS m,
          mysql.slave_relay_log_info AS rli
      SET m.log_pos = rli.master_log_pos,
          m.log_file = rli.master_log_name
    WHERE idx = (SELECT idx FROM current_master);
END $$
delimiter ;

delimiter $$
CREATE PROCEDURE fetch_next_master(
    OUT p_host CHAR(40), OUT p_port INT UNSIGNED,
    OUT p_file CHAR(40), OUT p_pos BIGINT)
BEGIN
   DECLARE l_next_idx INT DEFAULT 1;

   SELECT idx INTO l_next_idx FROM my_masters
    WHERE idx > (SELECT idx FROM current_master)
    ORDER BY idx LIMIT 1;

   SELECT idx INTO l_next_idx FROM my_masters
    WHERE idx >= l_next_idx
    ORDER BY idx LIMIT 1;

   UPDATE current_master SET idx = l_next_idx;

   SELECT host, port, log_pos, log_file
     INTO p_host, p_port, p_pos, p_file
     FROM my_masters
    WHERE idx = l_next_idx;
END $$
delimiter ;

delimiter $$
CREATE EVENT multi_source
    ON SCHEDULE EVERY 10 SECOND DO
BEGIN
   DECLARE l_host CHAR(40);
   DECLARE l_port INT UNSIGNED;
   DECLARE l_user CHAR(40);
   DECLARE l_passwd CHAR(40);
   DECLARE l_file CHAR(40);
   DECLARE l_pos BIGINT;

   SET SQL_LOG_BIN = 0;

   STOP SLAVE;
   START TRANSACTION;
   CALL save_position();
   CALL fetch_next_master(l_host, l_port, l_file, l_pos);
   SELECT name, passwd INTO l_user, l_passwd FROM replication_user;
   CALL change_master(l_host, l_port, l_user, l_passwd, l_file, l_pos);
   COMMIT;
   START SLAVE;
END $$
delimiter ;