Friday, December 18, 2009

MySQL Replicant: a library for controlling replication deployments

Keeping a MySQL installation up and running can be quite tricky at times, especially when having many servers to manage and monitor. In the replication tutorials at the annual MySQL Users' Conference, we demonstrate how to set up replication appropriately and also how to handle various issues that can arise. Many of these procedures are routine: bring down the server, edit the configuration file, bring the server up again, start a mysql client and add a user, etc.

It has always annoyed me that these procedures are perfect candidates for automation, but that we do not have the necessary interfaces to manipulate an entire installation of MySQL servers.

If there were an interface with a relatively small set of primitives—re-directing servers, bringing servers down, add a line to the configuration file, etc.—it would be possible to create pre-canned procedures that can just be executed.

To that end, I started writing on a library that would provide an interface like this. Although more familiar with Perl, Python was picked for this project, since it seems to be widely used by many database administrators (it's just a feeling I have, I have no figures to support it) and just to have a cool name on the library, we call it MySQL Replicant and it is (of course) available at Launchpad.

So what do we want to achieve with having a library like this? Well... the goal is to to provide an generic interface to complete installations and thereby make administration of large installations easy.

By providing such an interface, it will allow description of procedures in an executable format, namely as Python scripts.

In addition to making it easy to implement common tasks for experienced database administrators, it also promotes sharing by providing a way to write complete scripts for solving common problems. Having a pool of such scripts makes it easier for newcomers to get up and running.

The basic idea is that you create a model of the installation on a computer and then manipulate the model. When doing these manipulations, the appropriate commands—either as SQL commands to a running server or shell commands to the host where the server is running—will then be sent to the servers in the installation to configure them correctly.

So, to take small example, how does the code for re-directing a bunch of servers to a master look?

import mysqlrep, my_servers
for slave in my_server.slaves:
   mysqlrep.change_master(slave, my_servers.master)
In this case, the installation is defined in a separate file and is imported as a Python module. Right now, the interface for specifying a topology is quite rough, but this is going to change.
from mysqlrep import Server, User, Linux

servers = [Server(server_id=1, host="",
                  sql_user=User("mysql_replicant", "xyzzy"),
           Server(server_id=2, host="",
                  sql_user=User("mysql_replicant", "xyzzy"),
           Server(server_id=3, host="",
                  sql_user=User("mysql_replicant", "xyzzy"),
           Server(server_id=4, host="",
                  sql_user=User("mysql_replicant", "xyzzy"),
master = servers[0]
slaves = servers[1:]
Here, the Server class represents a server and to be able to do it's job, it is necessary to have one MySQL account on the server and one shell account on the host machine. Right now, it is also necessary to specify the server ID, but the plan is to just require the host, port, socket, SQL account name, and SSH account information. The remaining information can then be fetched from the configuration file of the server. Each server have a small set of primitives on top of which everything else is built:
Server.sql(SQL command)
Execute the SQL command and return a result set.
Server.ssh(command list)
Execute the command given by the command list return an iterator to the result output.
Start the server
Stop the server.
There is a small set of commands defined on top of these primitives that can be used. Here is a list of just a few of them, but there are some more in the library at Launchpad.
change_master(slave, master, position=None)
Change the master of slave to be master and start replicating from position.
Fetch the master position of server, which is the position where the last executed statement ends in the binary log.
Fetch the slave position of server, which is the position where the last executed event ends.
Flush all tables on server and lock the database for read.
Unlock a previously locked database.
Using these primitives, it is easy to clone a master by executing the code below. For this example, I use the quite naive method of backing up a database by creating an archive of the database files and copying them to the new slave.
from mysqlrep import flush_and_lock_database, fetch_master_position
from subprocess import call

position = fetch_master_position(master)
master.ssh("tar Pzcf " + backup_name + " /usr/var/mysql")
call(["scp", + ":" + backup_name, + ":."])
slave.ssh("tar Pzxf " + backup_name + " /usr/var/mysql")
What do you think? Would this be a valuable project to pursue? Here are some links related to this post:

Thursday, December 17, 2009

Using mysqld_multi on Karmic

I wanted to set up several servers on my machine using the Ubuntu distribution and control them using mysqld_multi: the typical way to manage several servers on your machine. However, I also wanted to use MySQL 5.1 and not 5.0, which is the default on Jaunty (Ubuntu 9.04). About a month ago, I upgraded to Karmic Koala and one of the reasons were that MySQL 5.1 is used by default. Even though I could install the latest revision all the time, I usually want to use the real distributions for my private projects for a number of reasons.

I actually tried to upgrade to MySQL 5.1 on Ubuntu 9.04, but I discovered that all kinds of applications had dependencies on MySQL 5.0, so I avoided to upgrade at that time.

Anyway, the procedure for installing multiple servers on the same machine is this:

  1. Shut down the running server.

    This is, strictly speaking, not necessary unless you are going to edit the options for the running server, but I do this as a precaution.

  2. Edit your my.cnf configuration file and add sections for mysqld_multi and the new servers.

    I wanted to add four servers to play with, not counting the one that is already installed and running, so I added sections mysqld1 to mysqld4. Also add a section for mysqld_multi

  3. Create server directories and database files using mysql_install_db

    The new servers need to be bootstrapped so that they have all the necessary databases and tables set up.

  4. Optionally: install an init.d script that uses mysqld_multi.

    This is currently not very well-supported in Debian (there is actually a comment saying that it is not supported), so I skipped this step. If you feel adventerous, you can always copy the /usr/share/mysql/mysqld_multi.server as /etc/init.d/mysql.server as they suggest in the file, but I will not do it, nor recommend it (because I haven't tried it).

  5. Start the installed server(s).

    Well, not much to say here.

So, on my way, I edited the /etc/mysql/my.cnf and added the sections necessary. (You can see a diff of that below.)

The important options to add are server-id so that each server gets a unique server id (I'm going to replicate between them), port and socket so that you can connect to each of them both when you're on the local machine and from another machine, and pid-file to give each server a unique pid file name (this is important, since the default will not work at all).

Next step is to install the data directories for the servers, which should be trivial:

$ sudo mysql_install_db --user=mysql --datadir=/var/lib/mysqlfoo --basedir=/usr
Installing MySQL system tables...
091120  9:40:23 [Warning] Can't create test file /var/lib/mysqlfoo/romeo.lower-test
091120  9:40:23 [Warning] Can't create test file /var/lib/mysqlfoo/romeo.lower-test
ERROR: 1005  Can't create table 'db' (errno: 13)
091120  9:40:23 [ERROR] Aborting

091120  9:40:23 [Warning] Forcing shutdown of 2 plugins
091120  9:40:23 [Note] /usr/sbin/mysqld: Shutdown complete

Installation of system tables failed!  Examine the logs in
/var/lib/mysqlfoo for more information.
OK, the warning is a warning, but it seems I forgot the permissions on the directory. Checking the write permissions, no problems. Hmmm... checking that I can create the directories and files manually as the mysql user, no problems(!)

What on earth is going on?

After some digging around, I found bug #201799 which quite clearly explains that what I thought was a permission problem is actually AppArmor doing its job.

So updating the AppArmor configuration file /etc/apparmor.d/usr.sbin.mysqld with this solved the problem and I could get on with installing the servers.

diff --git a/apparmor.d/usr.sbin.mysqld b/apparmor.d/usr.sbin.mysqld
index f9f1a37..7a94861 100644
--- a/apparmor.d/usr.sbin.mysqld
+++ b/apparmor.d/usr.sbin.mysqld
@@ -21,10 +25,20 @@
   /etc/mysql/my.cnf r,
   /usr/sbin/mysqld mr,
   /usr/share/mysql/** r,
   /var/log/mysql.log rw,
   /var/log/mysql.err rw,
+  /var/log/mysql[1-9].log rw,
+  /var/log/mysql[1-9].err rw,
   /var/lib/mysql/ r,
   /var/lib/mysql/** rwk,
+  /var/lib/mysql[1-9]/ r,
+  /var/lib/mysql[1-9]/** rwk,
   /var/log/mysql/ r,
   /var/log/mysql/* rw,
+  /var/log/mysql[1-9]/ r,
+  /var/log/mysql[1-9]/* rw,
   /var/run/mysqld/ w,
   /var/run/mysqld/mysqld.sock w,
+  /var/run/mysqld/mysqld[1-9].pid w,
+  /var/run/mysqld/mysqld[1-9].sock w,

Changes to /etc/mysql/my.cnf

Here is a unified diff of the changes I made to /etc/mysql/my.cnf to add some more servers.
$ git diff mysql/my.cnf
--- a/mysql/my.cnf
+++ b/mysql/my.cnf
@@ -111,7 +111,46 @@ max_binlog_size         = 100M
 # ssl-cert=/etc/mysql/server-cert.pem
 # ssl-key=/etc/mysql/server-key.pem
+mysqld         = /usr/bin/mysqld_safe
+mysqladmin     = /usr/bin/mysqladmin
+user           = root
+server-id      = 1
+pid-file = /var/run/mysqld/
+socket  = /var/run/mysqld/mysqld1.sock
+port  = 3307
+datadir = /var/lib/mysql1
+log-bin        = /var/lib/mysql1/mysqld1-bin.log
+log-bin-index  = /var/lib/mysql1/mysqld1-bin.index
+server-id      = 2
+pid-file = /var/run/mysqld/
+socket  = /var/run/mysqld/mysqld2.sock
+port  = 3308
+datadir = /var/lib/mysql2
+log-bin        = /var/lib/mysql2/mysqld2-bin.log
+log-bin-index  = /var/lib/mysql2/mysqld2-bin.index
+server-id      = 3
+pid-file = /var/run/mysqld/
+socket  = /var/run/mysqld/mysqld3.sock
+port  = 3309
+datadir = /var/lib/mysql3
+log-bin        = /var/lib/mysql3/mysqld3-bin.log
+log-bin-index  = /var/lib/mysql3/mysqld3-bin.log
+server-id      = 4
+pid-file = /var/run/mysqld/
+socket  = /var/run/mysqld/mysqld4.sock
+port  = 3310
+datadir = /var/lib/mysql4
+log-bin        = /var/lib/mysql4/mysqld3-bin.log
+log-bin-index  = /var/lib/mysql4/mysqld3-bin.log