Chapter 15. Replication

Table of Contents

15.1. Replication Configuration
15.1.1. How to Set Up Replication
15.1.2. Replication Startup Options and Variables
15.1.3. Common Replication Administration Tasks
15.2. Replication Solutions
15.2.1. Using Replication for Backups
15.2.2. Using Replication with Different Master and Slave Storage Engines
15.2.3. Using Replication for Scale-Out
15.2.4. Replicating Different Databases to Different Slaves
15.2.5. Improving Replication Performance
15.2.6. Switching Masters During Failover
15.2.7. Setting Up Replication Using SSL
15.3. Replication Notes and Tips
15.3.1. Replication Features and Issues
15.3.2. Replication Compatibility Between MySQL Versions
15.3.3. Upgrading a Replication Setup
15.3.4. Replication FAQ
15.3.5. Troubleshooting Replication
15.3.6. How to Report Replication Bugs or Problems
15.4. Replication Implementation Overview
15.4.1. Replication Implementation Details
15.4.2. Replication Relay and Status Files
15.4.3. How Servers Evaluate Replication Rules

Replication enables data from one MySQL database server (called the master) to be replicated to one or more MySQL database servers (slaves). Replication is asynchronous - your replication slaves do not need to be connected permanently to receive updates from the master, which means that updates can occur over long-distance connections and even temporary solutions such as a dial-up service. Depending on the configuration, you can replicate all databases, selected databases and even selected tables within a database.

The target uses for replication in MySQL include:

Replication in MySQL features support for one-way, asynchronous replication, in which one server acts as the master, while one or more other servers act as slaves. This is in contrast to the synchronous replication which is a characteristic of MySQL Cluster (see Chapter 16, MySQL Cluster).

There are a number of solutions available for setting up replication between two servers, but the best method to use depends on the presence of data and the engine types you are using. For more information on the available options, see Section 15.1.1, “How to Set Up Replication”.

Replication is controlled through a number of different options and variables. These control the core operation of the replication, timeouts and the databases and filters that can be applied on databases and tables. For more information on the available options, see Section 15.1.2, “Replication Startup Options and Variables”.

You can use replication to solve a number of different problems, including problems with performance, supporting the backup of different databases and for use as part of a larger solution to alleviate system failures. For information on how to address these issues, see Section 15.2, “Replication Solutions”.

For notes and tips on how different data types and statements are treated during replication, including details of replication features, version compatibility, upgrades, and problems and their resolution, including an FAQ, see Section 15.3, “Replication Notes and Tips”.

Detailed information on the implementation of replication, how replication works, the process and contents of the binary log, background threads and the rules used to decide how statements are recorded and replication, see Section 15.4, “Replication Implementation Overview”.

MySQL Enterprise The MySQL Enterprise Monitor provides numerous advisors that give immediate feedback about replication-related problems. For more information see http://www.mysql.com/products/enterprise/advisors.html.

15.1. Replication Configuration

Replication between servers in MySQL works through the use of the binary logging mechanism. The MySQL instance operating as the master (the source of the database changes) writes updates and changes to the database to the binary log. The information in the binary log is stored in different logging formats according to the database changes being recorded. Slaves are configured to read the binary log from the master and to execute the events in the binary log on the slave's local database.

The Master is dumb in this scenario. Once binary logging has been enabled, all statements are recorded in the binary log. Each slave will receive a copy of the entire contents of the binary log. It is the responsibility of the slave to decide which statements in the binary log should be executed; you cannot configure the master to log only certain events. If you do not specify otherwise, all events in the master binary log are executed on the slave. If required, you can configure the slave to only process events that apply to particular databases or tables.

Slaves keep a record of the binary log file and position within the log file that they have read and processed from the master. This means that multiple slaves can be connected to the master and executing different parts of the same binary log. Because the slaves control this process, individual slaves can be connected and disconnected from the server without affecting the master's operation. Also, because each slave remembers the position within the binary log, it is possible for slaves to be disconnected, reconnect and then 'catch up' by continuing from the recorded position.

Both the master and each slave must be configured with a unique id (using the server-id option). In addition, the slave must be configured with information about the master host name, log file name and position within that file. These details can be controlled from within a MySQL session using the CHANGE MASTER statement. The details are stored within the master.info file.

In this section the setup and configuration required for a replication environment is described, including step-by-step instructions for creating a new replication environment. The major components of this section are:

15.1.1. How to Set Up Replication

This section describes how to set up complete replication of a MySQL server. There are a number of different methods for setting up replication, and the exact method that you use will depend on how you are setting up replication, and whether you already have data within your master database.

There are some generic tasks which may be required for all replication setups:

Once you have configured the basic options, you will need to follow the instructions for your replication setup. A number of alternatives are provided:

If you want to administer a MySQL replication setup, we suggest that you read this entire chapter through and try all statements mentioned in Section 12.6.1, “SQL Statements for Controlling Master Servers”, and Section 12.6.2, “SQL Statements for Controlling Slave Servers”. You should also familiarize yourself with the replication startup options described in Section 15.1.2, “Replication Startup Options and Variables”.

Note

Note that certain steps within the setup process require the SUPER privilege. If you do not have this privilege then enabling replication may not be possible.

15.1.1.1. Creating a User for Replication

Each Slave must connect to the Master using a standard username and password. The user that you use for this operation can be any user, providing they have been granted the REPLICATION SLAVE privilege.

You do not need to create a specific user for replication. However, you should be aware that the username and password will be stored in plain text within the master.info file. Therefore you may want to create a user that only has privileges for the replication process.

To create a user or grant an existing user the privileges required for replication use the GRANT statement. If you create a user solely for the purposes of replication then that user only needs the REPLICATION SLAVE privilege. For example, to create a user, repl, that allows all hosts within the domain mydomain.com to connect for replication:

mysql> GRANT REPLICATION SLAVE ON *.*
    -> TO 'repl'@'%.mydomain.com' IDENTIFIED BY 'slavepass';

See Section 12.5.1.3, “GRANT Syntax”, for more information on the GRANT statement.

You may wish to create a different user for each slave, or use the same user for each slave that needs to connect. As long as each user that you want to use for the replication process has the REPLICATION SLAVE privilege you can create as many users as you require.

15.1.1.2. Setting the Replication Master Configuration

For replication to work you must enable binary logging on the master. If binary logging is not enabled, replication will not be possible as it is the binary log that is used to exchange data between the master and slaves.

Each server within a replication group must have a unique server-id. The server-id is used to identify individual servers within the group, and must be positive integer between 1 and (232)-1). How you organize and select the numbers is entirely up to you.

To configure both these options you will need to shut down your MySQL server and edit the configuration of the my.cnf or my.ini file.

You will need to add the following options to the configuration file within the [mysqld] section. If these options already exist, but are commented out, uncomment the options and alter them according to your needs. For example, to enable binary logging, using a log filename prefix of mysql-bin, and setting a server ID of 1:

[mysqld]
log-bin=mysql-bin
server-id=1

Note

For the greatest possible durability and consistency in a replication setup using InnoDB with transactions, you should use innodb_flush_log_at_trx_commit=1 and sync_binlog=1 in the master my.cnf file.

Note

Ensure that the skip-networking option has not been enabled on your replication master. If networking has been disabled, then your slave will not able to communicate with the master and replication will fail.

15.1.1.3. Setting the Replication Slave Configuration

The only option you must configure on the slave is to set the unique server ID. If this option is not already set, or the current value conflicts with the value that you have chosen for the master server, then you should shut down your slave server, and edit the configuration to specify the server id. For example:

[mysqld]
server-id=2

If you are setting up multiple slaves, each one must have a unique server-id value that differs from that of the master and from each of the other slaves. Think of server-id values as something similar to IP addresses: These IDs uniquely identify each server instance in the community of replication partners.

If you do not specify a server-id value, it is set to 1 if you have not defined master-host; otherwise it is set to 2. Note that in the case of server-id omission, a master refuses connections from all slaves, and a slave refuses to connect to a master. Thus, omitting server-id is good only for backup with a binary log.

You do not have to enable binary logging on the slave for replication to be enabled. However, if you enable binary logging on the slave then you can use the binary log for data backups and crash recovery on the slave, and also use the slave as part of a more complex replication topology.

15.1.1.4. Obtaining the Master Replication Information

To configure replication on the slave you must determine the masters current point within the master binary log. You will need this information so that when the slave starts the replication process, it is able to start processing events from the binary log at the correct point.

If you have existing data on your master that you want to synchronize on your slaves before starting the replication process, then you must stop processing statements on the master, obtain the current position, and then dump the data, before allowing the master to continue executing statements. If you do not stop the execution of statements then the data dump, the master status information that you use will not match and you will end up with inconsistent or corrupted databases on the slaves.

To get the master status information, follow these steps:

  1. Start the command line client and flush all tables and block write statements by executing the FLUSH TABLES WITH READ LOCK statement:

    mysql> FLUSH TABLES WITH READ LOCK;

    For InnoDB tables, note that FLUSH TABLES WITH READ LOCK also blocks COMMIT operations.

    Warning

    Leave the client from which you issued the FLUSH TABLES statement running so that the read lock remains in effect. If you exit the client, the lock is released.

  2. Use the SHOW MASTER STATUS statement to determine the current binary log name and offset on the master:

    mysql > SHOW MASTER STATUS;
    +---------------+----------+--------------+------------------+
    | File          | Position | Binlog_Do_DB | Binlog_Ignore_DB |
    +---------------+----------+--------------+------------------+
    | mysql-bin.003 | 73       | test         | manual,mysql     |
    +---------------+----------+--------------+------------------+
    

    The File column shows the name of the log and Position shows the offset within the file. In this example, the binary log file is mysql-bin.003 and the offset is 73. Record these values. You need them later when you are setting up the slave. They represent the replication coordinates at which the slave should begin processing new updates from the master.

    If the master has been running previously without binary logging enabled, the log name and position values displayed by SHOW MASTER STATUS or mysqldump --master-data will be empty. In that case, the values that you need to use later when specifying the slave's log file and position are the empty string ('') and 4.

You now have the information you need to enable the slave to start reading from the binary log in the correct place to start replication.

If you have existing data that needs be to synchronized with the slave before you start replication, leave the client running so that the lock remains in place and then proceed to Section 15.1.1.5, “Creating a Data Snapshot Using mysqldump, or Section 15.1.1.6, “Creating a Data Snapshot Using Raw Data Files”.

If you are setting up a brand new master and slave replication group, then you can exit the client and release the locks.

15.1.1.5. Creating a Data Snapshot Using mysqldump

One way to create a snapshot of the data in an existing master database is to use the mysqldump tool. Once the data dump has been completed, you then import this data into the slave before starting the replication process.

To obtain a snapshot of the data using mysqldump:

  • If you haven't already locked the tables on the server to prevent queries that update data from executing:

    Start the command line client and flush all tables and block write statements by executing the FLUSH TABLES WITH READ LOCK statement:

    mysql> FLUSH TABLES WITH READ LOCK;

    Remember to use SHOW MASTER STATUS and record the binary log details for use when starting up the slave. The point in time of your snapshot and the binary log position must match. See Section 15.1.1.4, “Obtaining the Master Replication Information”.

  • In another session, use mysqldump to create a dump either of all the databases you want to replicate, or by selecting specific databases individually. For example:

    shell> mysqldump --all-databases --lock-all-tables >dbdump.db
  • An alternative to using a bare dump, is to use the --master-data option, which will automatically append the CHANGE MASTER statement required on the slave to start the replication process.

    shell> mysqldump --all-databases --master-data >dbdump.db

When choosing databases to include in the dump, remember that you will need to filter out databases on each slave that you do not want to include in the replication process.

You will need either to copy the dump file to the slave, or to use the file from the master when connecting remotely to the slave to import the data.

15.1.1.6. Creating a Data Snapshot Using Raw Data Files

If your database is particularly large then copying the raw data files may be more efficient than using mysqldump and importing the file on each slave.

However, using this method with tables in storage engines with complex caching or logging algorithms may not give you a perfect “in time” snapshot as cache information and logging updates may not have been applied, even if you have acquired a global read lock. How the storage engine responds to this depends on its crash recovery abilities.

For example, if you are using InnoDB tables, you should use the InnoDB Hot Backup tool to obtain a consistent snapshot. This tool records the log name and offset corresponding to the snapshot to be later used on the slave. Hot Backup is a non-free (commercial) tool that is not included in the standard MySQL distribution. See the InnoDB Hot Backup home page at http://www.innodb.com/hot-backup for detailed information.

Otherwise, you can obtain a reliable binary snapshot of InnoDB tables only after shutting down the MySQL Server.

To create a raw data snapshot of MyISAM tables you can use standard copy tools such as cp or copy, a remote copy tool such as scp or rsync an archiving tool such as zip or tar, or a file system snapshot tool such as dump, providing that your MySQL data files exist on a single filesystem. If you are only replicating certain databases then make sure you only copy those files that related to those tables. (For InnoDB, all tables in all databases are stored in a single file unless you have the innodb_file_per_table option enabled.)

You may want to specifically exclude the following files from your archive:

  • Files relating to the mysql database.

  • The master.info file.

  • The master's binary log files.

  • Any relay log files.

To get the most consistent results with a raw data snapshot you should shut down the server during the process, as below:

  1. Acquire a read lock and get the master's status. See Section 15.1.1.4, “Obtaining the Master Replication Information”.

  2. In a separate session, shut down the MySQL server:

    shell> mysqladmin shutdown
    
  3. Take a copy of the MySQL data files. Examples are shown below for common solutions - you need to choose only one of these solutions:

    shell> tar cf /tmp/db.tar ./data
    shell> zip -r /tmp/db.zip ./data
    shell> rsync --recursive ./data /tmp/dbdata
    
  4. Start up the MySQL instance on the master.

If you are not using InnoDB tables, you can get a snapshot of the system from a master without shutting down the server as described in the following steps:

  1. Acquire a read lock and get the master's status. See Section 15.1.1.4, “Obtaining the Master Replication Information”.

  2. Take a copy of the MySQL data files. Examples are shown below for common solutions - you need to choose only one of these solutions:

    shell> tar cf /tmp/db.tar ./data
    shell> zip -r /tmp/db.zip ./data
    shell> rsync --recursive ./data /tmp/dbdata
    
  3. In the client where you acquired the read lock, free the lock:

    mysql> UNLOCK TABLES;
    

Once you have created the archive or copy of the database, you will need to copy the files to each slave before starting the slave replication process.

15.1.1.7. Setting Up Replication with New Master and Slaves

Setting up replication with a new Master and Slaves (i.e. with no existing data) is the easiest and most straightforward method for setting up replication.

You can also use this method if you are setting up new servers and have an existing dump of the databases that you want to load into your replication configuration. By loading the data onto a new master, the data will be automatically replicated to the slaves.

To set up replication between a new master and slave:

  1. Configure the MySQL master with the necessary configuration properties. See Section 15.1.1.2, “Setting the Replication Master Configuration”.

  2. Start up the MySQL master.

  3. Setup a user, see Section 15.1.1.1, “Creating a User for Replication”.

  4. Obtain the master status information. See Section 15.1.1.4, “Obtaining the Master Replication Information”.

  5. Free the read lock:

    mysql> UNLOCK TABLES;
  6. On the slave, edit the MySQL configuration. See Section 15.1.1.3, “Setting the Replication Slave Configuration”.

  7. Start up the MySQL slave.

  8. Execute the CHANGE MASTER command to set the master replication server configuration.

Because there is no data to load or exchange on a new server configuration you do not need to copy or import any information.

If you are setting up a new replication environment using the data from an existing database server, you will now need to run the dump file on the master. The database updates will automatically be propagated to the slaves:

shell> mysql -h master < fulldb.dump

15.1.1.8. Setting Up Replication with Existing Data

When setting up replication with existing data, you will need to decide how best to get the data from the master to the slave before starting the replication service.

The basic process for setting up replication with existing data is as follows:

  1. If you have not already configured the server-id and binary logging, you will need to shut down your master to configure these options. See Section 15.1.1.2, “Setting the Replication Master Configuration”.

    If you have to shut down your master database, then this is a good opportunity to take a snapshot of the database. You should obtain the master status (see Section 15.1.1.4, “Obtaining the Master Replication Information”) before taking the database down, updating the configuration and taking a snapshot. For information on how to create a snapshot using raw data files, see Section 15.1.1.6, “Creating a Data Snapshot Using Raw Data Files”.

  2. If your server is already correctly configured, obtain the master status (see Section 15.1.1.4, “Obtaining the Master Replication Information”) and then use mysqldump to take a snapshot (see Section 15.1.1.5, “Creating a Data Snapshot Using mysqldump) or take a raw snapshot of the live database using the guide in Section 15.1.1.6, “Creating a Data Snapshot Using Raw Data Files”.

  3. With the MySQL master running, create a user to be used by the slave when connecting to the master during replication. See Section 15.1.1.1, “Creating a User for Replication”.

  4. Update the configuration of the slave, see Section 15.1.1.3, “Setting the Replication Slave Configuration”.

  5. The next step depends on how you created the snapshot of data on the master.

    If you used mysqldump:

    1. Startup the slave, skipping replication by using the --skip-slave option.

    2. Import the dump file:

      shell> mysql < fulldb.dump

    If you created a snapshot using the raw data files:

    1. Extract the data files into your slave data directory. For example:

      shell> tar xvf dbdump.tar

      You may need to set permissions and ownership on the files to match the configuration of your slave.

    2. Startup the slave, skipping replication by using the --skip-slave option.

  6. Configure the slave with the master status information. This will tell the slave the binary log file and position within the file where replication needs to start, and configure the login credentials and hostname of the master. For more information on the statement required, see Section 15.1.1.10, “Setting the Master Configuration on the Slave”.

  7. Start the slave threads:

    mysql> START SLAVE;
    

After you have performed this procedure, the slave should connect to the master and catch up on any updates that have occurred since the snapshot was taken.

If you have forgotten to set the server-id option for the master, slaves cannot connect to it.

If you have forgotten to set the server-id option for the slave, you get the following error in the slave's error log:

Warning: You should set server-id to a non-0 value if master_host
is set; we will force server id to 2, but this MySQL server will
not act as a slave.

You also find error messages in the slave's error log if it is not able to replicate for any other reason.

Once a slave is replicating, you can find in its data directory one file named master.info and another named relay-log.info. The slave uses these two files to keep track of how much of the master's binary log it has processed. Do not remove or edit these files unless you know exactly what you are doing and fully understand the implications. Even in that case, it is preferred that you use the CHANGE MASTER TO statement to change replication parameters. The slave will use the values specified in the statement to update the status files automatically.

Note

The content of master.info overrides some of the server options specified on the command line or in my.cnf. See Section 15.1.2, “Replication Startup Options and Variables”, for more details.

Once you have a snapshot of the master, you can use it to set up other slaves by following the slave portion of the procedure just described. You do not need to take another snapshot of the master; you can use the same one for each slave.

15.1.1.9. Introducing Additional Slaves to an Existing Replication Environment

If you want to add another slave to the existing replication configuration then you can do so without stopping the master. Instead, you duplicate the settings on the slaves.

To duplicate the slave:

  1. Shut down the existing slave (slavea):

    shell> mysqladmin shutdown
  2. Copy the data directory from the existing slave to the new slave. You can do this by creating an archive using tar or WinZip, or by performing a direct copy using a tool such as cp or rsync. Ensure you also copy the log files and relay log files.

  3. Copy the master.info and relay.info files from the existing slave to the new slave. These files hold the current log positions.

  4. Start the existing slave.

  5. On the new slave, edit the configuration and the give the new slave a new unique server-id.

  6. Start the new slave; the master.info file options will be used to start the replication process.

15.1.1.10. Setting the Master Configuration on the Slave

To set up the slave to communicate with the master for replication, you must tell the slave the necessary connection information. To do this, execute the following statement on the slave, replacing the option values with the actual values relevant to your system:

mysql> CHANGE MASTER TO
    ->     MASTER_HOST='master_host_name',
    ->     MASTER_USER='replication_user_name',
    ->     MASTER_PASSWORD='replication_password',
    ->     MASTER_LOG_FILE='recorded_log_file_name',
    ->     MASTER_LOG_POS=recorded_log_position;

Note

Replication cannot use Unix socket files. You must be able to connect to the master MySQL server using TCP/IP.

The following table shows the maximum allowable length for the string-valued options:

MASTER_HOST60
MASTER_USER16
MASTER_PASSWORD32
MASTER_LOG_FILE255

15.1.2. Replication Startup Options and Variables

This section describes the options that you can use on slave replication servers. You can specify these options either on the command line or in an option file.

On the master and each slave, you must use the server-id option to establish a unique replication ID. For each server, you should pick a unique positive integer in the range from 1 to 232 – 1, and each ID must be different from every other ID. Example: server-id=3

Options that you can use on the master server for controlling binary logging are described in Section 5.11.3, “The Binary Log”.

Some slave server replication options are handled in a special way, in the sense that each is ignored if a master.info file exists when the slave starts and contains a value for the option. The following options are handled this way:

  • --master-host

  • --master-user

  • --master-password

  • --master-port

  • --master-connect-retry

  • --master-ssl

  • --master-ssl-ca

  • --master-ssl-capath

  • --master-ssl-cert

  • --master-ssl-cipher

  • --master-ssl-key

The master.info file format in MySQL 5.0 includes values corresponding to the SSL options. In addition, the file format includes as its first line the number of lines in the file. (See Section 15.4.2, “Replication Relay and Status Files”.) If you upgrade an older server (before MySQL 4.1.1) to a newer version, the new server upgrades the master.info file to the new format automatically when it starts. However, if you downgrade a newer server to an older version, you should remove the first line manually before starting the older server for the first time.

If no master.info file exists when the slave server starts, it uses the values for those options that are specified in option files or on the command line. This occurs when you start the server as a replication slave for the very first time, or when you have run RESET SLAVE and then have shut down and restarted the slave.

If the master.info file exists when the slave server starts, the server uses its contents and ignores any options that correspond to the values listed in the file. Thus, if you start the slave server with different values of the startup options that correspond to values in the master.info file, the different values have no effect, because the server continues to use the master.info file. To use different values, you must either restart after removing the master.info file or (preferably) use the CHANGE MASTER TO statement to reset the values while the slave is running.

Suppose that you specify this option in your my.cnf file:

[mysqld]
master-host=some_host

The first time you start the server as a replication slave, it reads and uses that option from the my.cnf file. The server then records the value in the master.info file. The next time you start the server, it reads the master host value from the master.info file only and ignores the value in the option file. If you modify the my.cnf file to specify a different master host of some_other_host, the change still has no effect. You should use CHANGE MASTER TO instead.

MySQL Enterprise For expert advice regarding master startup options subscribe to the MySQL Enterprise Monitor. For more information see http://www.mysql.com/products/enterprise/advisors.html.

Because the server gives an existing master.info file precedence over the startup options just described, you might prefer not to use startup options for these values at all, and instead specify them by using the CHANGE MASTER TO statement. See Section 12.6.2.1, “CHANGE MASTER TO Syntax”.

This example shows a more extensive use of startup options to configure a slave server:

[mysqld]
server-id=2
master-host=db-master.mycompany.com
master-port=3306
master-user=pertinax
master-password=freitag
master-connect-retry=60
report-host=db-slave.mycompany.com

The following list describes startup options for controlling replication. Many of these options can be reset while the server is running by using the CHANGE MASTER TO statement. Others, such as the --replicate-* options, can be set only when the slave server starts.

  • --log-slave-updates

    Normally, a slave does not log to its own binary log any updates that are received from a master server. This option tells the slave to log the updates performed by its SQL thread to its own binary log. For this option to have any effect, the slave must also be started with the --log-bin option to enable binary logging. --log-slave-updates is used when you want to chain replication servers. For example, you might want to set up replication servers using this arrangement:

    A -> B -> C
    

    Here, A serves as the master for the slave B, and B serves as the master for the slave C. For this to work, B must be both a master and a slave. You must start both A and B with --log-bin to enable binary logging, and B with the --log-slave-updates option so that updates received from A are logged by B to its binary log.

  • --log-warnings[=level]

    This option causes a server to print more messages to the error log about what it is doing. With respect to replication, the server generates warnings that it succeeded in reconnecting after a network/connection failure, and informs you as to how each slave thread started. This option is enabled by default; to disable it, use --skip-log-warnings. Aborted connections are not logged to the error log unless the value is greater than 1.

  • --master-connect-retry=seconds

    The number of seconds that the slave thread sleeps before trying to reconnect to the master in case the master goes down or the connection is lost. The value in the master.info file takes precedence if it can be read. If not set, the default is 60. Connection retries are not invoked until the slave times out reading data from the master according to the value of --slave-net-timeout. The number of reconnection attempts is limited by the --master-retry-count option.

  • --master-host=host_name

    The hostname or IP number of the master replication server. The value in master.info takes precedence if it can be read. If no master host is specified, the slave thread does not start.

  • --master-info-file=file_name

    The name to use for the file in which the slave records information about the master. The default name is master.info in the data directory.

  • --master-password=password

    The password of the account that the slave thread uses for authentication when it connects to the master. The value in the master.info file takes precedence if it can be read. If not set, an empty password is assumed.

  • --master-port=port_number

    The TCP/IP port number that the master is listening on. The value in the master.info file takes precedence if it can be read. If not set, the compiled-in setting is assumed (normally 3306).

  • --master-retry-count=count

    The number of times that the slave tries to connect to the master before giving up. Reconnects are attempted at intervals set by --master-connect-retry and reconnects are triggered when data reads by the slave time out according to the --slave-net-timeout option. The default value is 86400.

  • --master-ssl, --master-ssl-ca=file_name, --master-ssl-capath=directory_name, --master-ssl-cert=file_name, --master-ssl-cipher=cipher_list, --master-ssl-key=file_name

    These options are used for setting up a secure replication connection to the master server using SSL. Their meanings are the same as the corresponding --ssl, --ssl-ca, --ssl-capath, --ssl-cert, --ssl-cipher, --ssl-key options that are described in Section 5.8.7.3, “SSL Command Options”. The values in the master.info file take precedence if they can be read.

  • --master-user=user_name

    The username of the account that the slave thread uses for authentication when it connects to the master. This account must have the REPLICATION SLAVE privilege. The value in the master.info file takes precedence if it can be read. If the master username is not set, the name test is assumed.

  • --max-relay-log-size=size

    The size at which the server rotates relay log files automatically. For more information, see Section 15.4.2, “Replication Relay and Status Files”. The default size is 1GB.

  • --read-only

    When this option is given, the server allows no updates except from users that have the SUPER privilege or (on a slave server) from updates performed by slave threads. On a slave server, this can be useful to ensure that the slave accepts updates only from its master server and not from clients. As of MySQL 5.0.16, this option does not apply to TEMPORARY tables.

  • --relay-log=file_name

    The basename for the relay log. The default basename is host_name-relay-bin. The server creates relay log files in sequence by adding a numeric suffix to the basename. You can specify the option to create hostname-independent relay log names, or if your relay logs tend to be big (and you don't want to decrease max_relay_log_size) and you need to put them in some area different from the data directory, or if you want to increase speed by balancing load between disks.

  • --relay-log-index=file_name

    The name to use for the relay log index file. The default name is host_name-relay-bin.index in the data directory, where host_name is the name of the slave server.

  • --relay-log-info-file=file_name

    The name to use for the file in which the slave records information about the relay logs. The default name is relay-log.info in the data directory.

  • --relay-log-purge={0|1}

    Disable or enable automatic purging of relay logs as soon as they are not needed any more. The default value is 1 (enabled). This is a global variable that can be changed dynamically with SET GLOBAL relay_log_purge = N.

  • --relay-log-space-limit=size

    This option places an upper limit on the total size in bytes of all relay logs on the slave. A value of 0 means “no limit.” This is useful for a slave server host that has limited disk space. When the limit is reached, the I/O thread stops reading binary log events from the master server until the SQL thread has caught up and deleted some unused relay logs. Note that this limit is not absolute: There are cases where the SQL thread needs more events before it can delete relay logs. In that case, the I/O thread exceeds the limit until it becomes possible for the SQL thread to delete some relay logs, because not doing so would cause a deadlock. You should not set --relay-log-space-limit to less than twice the value of --max-relay-log-size (or --max-binlog-size if --max-relay-log-size is 0). In that case, there is a chance that the I/O thread waits for free space because --relay-log-space-limit is exceeded, but the SQL thread has no relay log to purge and is unable to satisfy the I/O thread. This forces the I/O thread to temporarily ignore --relay-log-space-limit.

  • --replicate-do-db=db_name

    Tell the slave to restrict replication to statements where the default database (that is, the one selected by USE) is db_name. To specify more than one database, use this option multiple times, once for each database. Note that this does not replicate cross-database statements such as UPDATE some_db.some_table SET foo='bar' while having selected a different database or no database.

    Warning

    To specify multiple databases you must use multiple instances of this option. Because database names can contain commas, if you supply a comma separated list then the list will be treated as the name of a single database.

    An example of what does not work as you might expect: If the slave is started with --replicate-do-db=sales and you issue the following statements on the master, the UPDATE statement is not replicated:

    USE prices;
    UPDATE sales.january SET amount=amount+1000;
    

    The main reason for this “just check the default database” behavior is that it is difficult from the statement alone to know whether it should be replicated (for example, if you are using multiple-table DELETE statements or multiple-table UPDATE statements that act across multiple databases). It is also faster to check only the default database rather than all databases if there is no need.

    If you need cross-database updates to work, use --replicate-wild-do-table=db_name.% instead. See Section 15.4.3, “How Servers Evaluate Replication Rules”.

  • --replicate-do-table=db_name.tbl_name

    Tell the slave thread to restrict replication to the specified table. To specify more than one table, use this option multiple times, once for each table. This works for cross-database updates, in contrast to --replicate-do-db. See Section 15.4.3, “How Servers Evaluate Replication Rules”.

  • --replicate-ignore-db=db_name

    Tells the slave to not replicate any statement where the default database (that is, the one selected by USE) is db_name. To specify more than one database to ignore, use this option multiple times, once for each database. You should not use this option if you are using cross-database updates and you do not want these updates to be replicated. See Section 15.4.3, “How Servers Evaluate Replication Rules”.

    MySQL Enterprise For expert advice regarding slave startup options subscribe to the MySQL Enterprise Monitor. For more information see http://www.mysql.com/products/enterprise/advisors.html.

    An example of what does not work as you might expect: If the slave is started with --replicate-ignore-db=sales and you issue the following statements on the master, the UPDATE statement is replicated:

    USE prices;
    UPDATE sales.january SET amount=amount+1000;
    

    Note

    In the preceding example the statement is replicated because --replicate-ignore-db only applies to the default database (set through the USE statement). Because the sales database was specified explicitly in the statement, the statement has not been filtered.

    If you need cross-database updates to work, use --replicate-wild-ignore-table=db_name.% instead. See Section 15.4.3, “How Servers Evaluate Replication Rules”.

  • --replicate-ignore-table=db_name.tbl_name

    Tells the slave thread to not replicate any statement that updates the specified table, even if any other tables might be updated by the same statement. To specify more than one table to ignore, use this option multiple times, once for each table. This works for cross-database updates, in contrast to --replicate-ignore-db. See Section 15.4.3, “How Servers Evaluate Replication Rules”.

  • --replicate-rewrite-db=from_name->to_name

    Tells the slave to translate the default database (that is, the one selected by USE) to to_name if it was from_name on the master. Only statements involving tables are affected (not statements such as CREATE DATABASE, DROP DATABASE, and ALTER DATABASE), and only if from_name is the default database on the master. This does not work for cross-database updates. The database name translation is done before the --replicate-* rules are tested.

    If you use this option on the command line and the “>” character is special to your command interpreter, quote the option value. For example:

    shell> mysqld --replicate-rewrite-db="olddb->newdb"
    
  • --replicate-same-server-id

    To be used on slave servers. Usually you should use the default setting of 0, to prevent infinite loops caused by circular replication. If set to 1, the slave does not skip events having its own server ID. Normally, this is useful only in rare configurations. Cannot be set to 1 if --log-slave-updates is used. Note that by default the slave I/O thread does not even write binary log events to the relay log if they have the slave's server id (this optimization helps save disk usage). So if you want to use --replicate-same-server-id, be sure to start the slave with this option before you make the slave read its own events that you want the slave SQL thread to execute.

  • --replicate-wild-do-table=db_name.tbl_name

    Tells the slave thread to restrict replication to statements where any of the updated tables match the specified database and table name patterns. Patterns can contain the “%” and “_” wildcard characters, which have the same meaning as for the LIKE pattern-matching operator. To specify more than one table, use this option multiple times, once for each table. This works for cross-database updates. See Section 15.4.3, “How Servers Evaluate Replication Rules”.

    Example: --replicate-wild-do-table=foo%.bar% replicates only updates that use a table where the database name starts with foo and the table name starts with bar.

    If the table name pattern is %, it matches any table name and the option also applies to database-level statements (CREATE DATABASE, DROP DATABASE, and ALTER DATABASE). For example, if you use --replicate-wild-do-table=foo%.%, database-level statements are replicated if the database name matches the pattern foo%.

    To include literal wildcard characters in the database or table name patterns, escape them with a backslash. For example, to replicate all tables of a database that is named my_own%db, but not replicate tables from the my1ownAABCdb database, you should escape the “_” and “%” characters like this: --replicate-wild-do-table=my\_own\%db. If you're using the option on the command line, you might need to double the backslashes or quote the option value, depending on your command interpreter. For example, with the bash shell, you would need to type --replicate-wild-do-table=my\\_own\\%db.

  • --replicate-wild-ignore-table=db_name.tbl_name

    Tells the slave thread not to replicate a statement where any table matches the given wildcard pattern. To specify more than one table to ignore, use this option multiple times, once for each table. This works for cross-database updates. See Section 15.4.3, “How Servers Evaluate Replication Rules”.

    Example: --replicate-wild-ignore-table=foo%.bar% does not replicate updates that use a table where the database name starts with foo and the table name starts with bar.

    For information about how matching works, see the description of the --replicate-wild-do-table option. The rules for including literal wildcard characters in the option value are the same as for --replicate-wild-ignore-table as well.

  • --report-host=slave_name

    The hostname or IP number of the slave to be reported to the master during slave registration. This value appears in the output of SHOW SLAVE HOSTS on the master server. Leave the value unset if you do not want the slave to register itself with the master. Note that it is not sufficient for the master to simply read the IP number of the slave from the TCP/IP socket after the slave connects. Due to NAT and other routing issues, that IP may not be valid for connecting to the slave from the master or other hosts.

  • --report-password=password

    The account password of the slave to be reported to the master during slave registration. This value appears in the output of SHOW SLAVE HOSTS on the master server if the --show-slave-auth-info option is given.

  • --report-port=slave_port_num

    The TCP/IP port number for connecting to the slave, to be reported to the master during slave registration. Set this only if the slave is listening on a non-default port or if you have a special tunnel from the master or other clients to the slave. If you are not sure, do not use this option.

  • --report-user=user_name

    The account username of the slave to be reported to the master during slave registration. This value appears in the output of SHOW SLAVE HOSTS on the master server if the --show-slave-auth-info option is given.

  • --show-slave-auth-info

    Display slave usernames and passwords in the output of SHOW SLAVE HOSTS on the master server for slaves started with the --report-user and --report-password options.

  • --skip-slave-start

    Tells the slave server not to start the slave threads when the server starts. To start the threads later, use a START SLAVE statement.

  • --slave_compressed_protocol={0|1}

    If this option is set to 1, use compression for the slave/master protocol if both the slave and the master support it. The default is 0 (no compression).

  • --slave-load-tmpdir=file_name

    The name of the directory where the slave creates temporary files. This option is by default equal to the value of the tmpdir system variable. When the slave SQL thread replicates a LOAD DATA INFILE statement, it extracts the file to be loaded from the relay log into temporary files, and then loads these into the table. If the file loaded on the master is huge, the temporary files on the slave are huge, too. Therefore, it might be advisable to use this option to tell the slave to put temporary files in a directory located in some filesystem that has a lot of available space. In that case, the relay logs are huge as well, so you might also want to use the --relay-log option to place the relay logs in that filesystem.

    The directory specified by this option should be located in a disk-based filesystem (not a memory-based filesystem) because the temporary files used to replicate LOAD DATA INFILE must survive machine restarts. The directory also should not be one that is cleared by the operating system during the system startup process.

  • --slave-net-timeout=seconds

    The number of seconds to wait for more data from the master before the slave considers the connection broken, aborts the read, and tries to reconnect. The first retry occurs immediately after the timeout. The interval between retries is controlled by the --master-connect-retry option and the number of reconnection attempts is limited by the --master-retry-count option. The default is 3600 seconds (one hour).

  • --slave-skip-errors=[err_code1,err_code2,...|all]

    Normally, replication stops when an error occurs on the slave. This gives you the opportunity to resolve the inconsistency in the data manually. This option tells the slave SQL thread to continue replication when a statement returns any of the errors listed in the option value.

    Do not use this option unless you fully understand why you are getting errors. If there are no bugs in your replication setup and client programs, and no bugs in MySQL itself, an error that stops replication should never occur. Indiscriminate use of this option results in slaves becoming hopelessly out of synchrony with the master, with you having no idea why this has occurred.

    For error codes, you should use the numbers provided by the error message in your slave error log and in the output of SHOW SLAVE STATUS. Appendix B, Errors, Error Codes, and Common Problems, lists server error codes.

    You can also (but should not) use the very non-recommended value of all to cause the slave to ignore all error messages and keeps going regardless of what happens. Needless to say, if you use all, there are no guarantees regarding the integrity of your data. Please do not complain (or file bug reports) in this case if the slave's data is not anywhere close to what it is on the master. You have been warned.

    Examples:

    --slave-skip-errors=1062,1053
    --slave-skip-errors=all
    

15.1.3. Common Replication Administration Tasks

Once replication has been started it should execute without requiring much regular administration. Depending on your replication environment, you will want to check the replication status of each slave either periodically, daily, or even more frequently.

MySQL Enterprise For regular reports regarding the status of your slaves, subscribe to the MySQL Network Monitoring and Advisory Service. For more information see http://www.mysql.com/products/enterprise/advisors.html.

15.1.3.1. Checking Replication Status

The most common task when managing a replication process is to ensure that replication is taking place and that there have been no errors between the slave and the master.

The primary command for this is SHOW SLAVE STATUS which you must execute on each slave:

mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
             Slave_IO_State: Waiting for master to send event
                Master_Host: master1
                Master_User: root
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: mysql-bin.000004
        Read_Master_Log_Pos: 931
             Relay_Log_File: slave1-relay-bin.000056
              Relay_Log_Pos: 950
      Relay_Master_Log_File: mysql-bin.000004
           Slave_IO_Running: Yes
          Slave_SQL_Running: Yes
            Replicate_Do_DB: 
        Replicate_Ignore_DB: 
         Replicate_Do_Table: 
     Replicate_Ignore_Table: 
    Replicate_Wild_Do_Table: 
Replicate_Wild_Ignore_Table: 
                 Last_Errno: 0
                 Last_Error: 
               Skip_Counter: 0
        Exec_Master_Log_Pos: 931
            Relay_Log_Space: 1365
            Until_Condition: None
             Until_Log_File: 
              Until_Log_Pos: 0
         Master_SSL_Allowed: No
         Master_SSL_CA_File: 
         Master_SSL_CA_Path: 
            Master_SSL_Cert: 
          Master_SSL_Cipher: 
             Master_SSL_Key: 
      Seconds_Behind_Master: 0
1 row in set (0.01 sec)

The key fields from the status report to examine are:

  • Slave_IO_State — indicates the current status of the slave. See Section 6.5.5.5, “Replication Slave I/O Thread States”, and Section 6.5.5.6, “Replication Slave SQL Thread States”, for more information.

  • Slave_IO_Running — shows whether the IO thread for the reading the master's binary log is running.

  • Slave_SQL_Running — shows whether the SQL thread for the executing events in the relay log is running.

  • Last_Error — shows the last error registered when processing the relay log. Ideally this should be blank, indicating no errors.

  • Seconds_Behind_Master — shows the number of seconds that the slave SQL thread is behind processing the master binary log. A high number (or an increasing one) can indicate that the slave is unable to cope with the large number of queries from the master.

On the master, you can check the status of slaves by examining the list of running processes. Slaves execute the Binlog Dump command:

mysql> SHOW PROCESSLIST \G;
*************************** 4. row ***************************
     Id: 10
   User: root
   Host: slave1:58371
     db: NULL
Command: Binlog Dump
   Time: 777
  State: Has sent all binlog to slave; waiting for binlog to be updated
   Info: NULL

Because it is the slave that drives the core of the replication process, very little information is available in this report.

If you have used the --report-host option, then the SHOW SLAVE HOSTS statement will show basic information about connected slaves:

mysql> SHOW SLAVE HOSTS;
+-----------+--------+------+-------------------+-----------+
| Server_id | Host   | Port | Rpl_recovery_rank | Master_id |
+-----------+--------+------+-------------------+-----------+
|        10 | slave1 | 3306 |                 0 |         1 | 
+-----------+--------+------+-------------------+-----------+
1 row in set (0.00 sec)

The output includes the ID of the slave server, the value of the --report-host option, the connecting port, master ID and the priority of the slave for receiving binary log updates.

15.1.3.2. Pausing Replication on the Slave

You can stop and start the replication of statements on the slave using the STOP SLAVE and START SLAVE commands.

To stop execution of the binary log from the master, use STOP SLAVE:

mysql> STOP SLAVE;

When execution is stopped, the slave does not read the binary log from the master (the IO_THREAD) and stops processing events from the relay log that have not yet been executed (the SQL_THREAD). You can pause either the IO or SQL threads individually by specifying the thread type. For example:

mysql> STOP SLAVE IO_THREAD;

Stopping the SQL thread can be useful if you want to perform a backup or other task on a slave that only processes events from the master. The IO thread will continue to be read from the master, but not executed, which will make it easier for the slave to catch up when you start slave operations again.

Stopping the IO thread will allow the statements in the relay log to be executed up until the point where the relay log has ceased to receive new events. Using this option can be useful when you want to pause execution to allow the slave to catch up with events from the master, when you want to perform administration on the slave but also ensure you have the latest updates to a specific point. This method can also be used to pause execution on the slave while you conduct administration on the master while ensuring that there is not a massive backlog of events to be executed when replication is started again.

To start execution again, use the START SLAVE statement:

mysql> START SLAVE;

If necessary, you can start either the IO_THREAD or SQL_THREAD threads individually.

15.2. Replication Solutions

Replication can be used in many different environments for a range of purposes. In this section you will find general notes and advice on using replication for specific solution types.

For information on using replication in a backup environment, including notes on the setup, backup procedure, and files to back up, see Section 15.2.1, “Using Replication for Backups”.

For advice and tips on using different storage engines on the master and slaves, see Section 15.2.2, “Using Replication with Different Master and Slave Storage Engines”.

Using replication as a scale-out solution requires some changes in the logic and operation of applications that use the solution. See Section 15.2.3, “Using Replication for Scale-Out”.

For performance or data distribution reasons you may want to replicate different databases to different replication slaves. See Section 15.2.4, “Replicating Different Databases to Different Slaves”

As the number of replication slaves increases, the load on the master can increase (because of the need to replicate the binary log to each slave) and lead to a reduction in performance of the master. For tips on improving your replication performance, including using a single secondary server as an replication master, see Section 15.2.5, “Improving Replication Performance”.

For guidance on switching masters, or converting slaves into masters as part of an emergency failover solution, see Section 15.2.6, “Switching Masters During Failover”.

To secure your replication communication you can encrypt the communication channel by using SSL to exchange data. Step-by-step instructions can be found in Section 15.2.7, “Setting Up Replication Using SSL”.

15.2.1. Using Replication for Backups

You can use replication as a backup solution by replicating data from the master to a slave, and then backing up the data slave. Because the slave can be paused and shut down without affecting the running operation of the master you can produce an effective snapshot of 'live' data that would otherwise require a shutdown of the master database.

How you back up the database will depend on the size of the database and whether you are backing up only the data, or the data and the replication slave state so that you can rebuild the slave in the event of failure. There are therefore two choices:

If you are using replication as a solution to enable you to back up the data on the master, and the size of your database is not too large, then the mysqldump tool may be suitable. See Section 15.2.1.1, “Backing Up Using mysqldump.

For larger databases, where mysqldump would be impractical or inefficient, you can back up the raw data files instead. Using the raw data files option also means that you can back up the binary and relay logs that will enable you to recreate the slave in the event of a slave failure. For more information, see Section 15.2.1.2, “Backing Up Raw Data”.

15.2.1.1. Backing Up Using mysqldump

Using mysqldump to create a copy of the database enables you to capture all of the data in the database in a format that allows the information to be imported into another instance of MySQL. Because the format of the information is SQL statements the file can easily be distributed and applied to running servers in the event that you need access to the data in an emergency. However, if the size of your data set is very large then mysqldump may be impractical.

When using mysqldump you should stop the slave before starting the dump process to ensure that the dump contains a consistent set of data:

  1. Stop the slave from processing requests. You can either stop the slave completely using mysqladmin:

    shell> mysqladmin stop-slave

    Alternatively, you can stop processing the relay log files by stopping the replication SQL thread. Using this method will allow the binary log data to be transferred. Within busy replication environments this may speed up the catch-up process when you start the slave processing again:

    shell> mysql -e 'STOP SLAVE SQL_THREAD;'
  2. Run mysqldump to dump your databases. You may either select databases to be dumped, or dump all databases. For more information see Section 7.13, “mysqldump — A Database Backup Program”. For example, to dump all databases:

    shell> mysqldump --all-databases >fulldb.dump
  3. Once the dump has completed, start slave operations again:

    shell> mysqladmin start-slave

In the preceding example you may want to add login credentials (username, password) to the commands, and bundle the process up into a script that you can run automatically each day.

If you use this approach, make sure you monitor the slave replication process to ensure that the time taken to run the backup in this way is not affecting the slaves ability to keep up with events from the master. See Section 15.1.3.1, “Checking Replication Status”. If the slave is unable to keep up you may want to add another server and distribute the backup process. For an example of how to configure this scenario, see Section 15.2.4, “Replicating Different Databases to Different Slaves”.

15.2.1.2. Backing Up Raw Data

To guarantee the integrity of the files that are copied, backing up the raw data files on your MySQL replication slave should take place while your slave server is shut down. If the MySQL server is still running then background tasks, particularly with storage engines with background processes such as InnoDB, may still be updating the database files. With InnoDB, these problems should be resolved during crash recovery, but since the slave server can be shut down during the backup process without affecting the execution of the master it makes sense to take advantage of this facility.

To shut down the server and back up the files:

  1. Shut down the slave MySQL server:

    shell> mysqladmin shutdown
  2. Copy the data files. You can use any suitable copying or archive utility, including cp, tar or WinZip:

    tar cf /tmp/dbbackup.tar ./data
  3. Start up the mysqld process again:

    shell> mysqld_safe &

    Under Windows:

    C:\> "C:\Program Files\MySQL\MySQL Server 5.1\bin\mysqld"

Normally you should back up the entire data folder for the slave MySQL server. If you want to be able to restore the data and operate as a slave (for example, in the event of failure of the slave), then when you back up the slave's data, you should back up the slave status files, master.info and relay.info, along with the relay log files. These files are needed to resume replication after you restore the slave's data.

If you lose the relay logs but still have the relay-log.info file, you can check it to determine how far the SQL thread has executed in the master binary logs. Then you can use CHANGE MASTER TO with the MASTER_LOG_FILE and MASTER_LOG_POS options to tell the slave to re-read the binary logs from that point. Of course, this requires that the binary logs still exist on the master server.

If your slave is subject to replicating LOAD DATA INFILE statements, you should also back up any SQL_LOAD-* files that exist in the directory that the slave uses for this purpose. The slave needs these files to resume replication of any interrupted LOAD DATA INFILE operations. The directory location is specified using the --slave-load-tmpdir option. If this option is not specified, the directory location is the value of the tmpdir system variable.

15.2.2. Using Replication with Different Master and Slave Storage Engines

The replication process does not care if the source table on the master and the replicated table on the slave use different engine types. In fact, the system variables storage_engine and table_type are not replicated.

This provides a number of advantages in the replication process in that you can take advantage of different engine types for different replication scenarios. For example, in a typical scaleout scenario (see Section 15.2.3, “Using Replication for Scale-Out”), you want to use InnoDB tables on the master to take advantage of the transactional functionality, but use MyISAM on the slaves where transaction support is not required because the data is only read. When using replication in a data logging environment you may want to use the Archive storage engine on the slave.

Setting up different engines on the master and slave depends on how you set up the initial replication process:

  • If you used mysqldump to create the database snapshot on your master then you could edit the dump text to change the engine type used on each table.

    Another alternative for mysqldump is to disable engine types that you do not want to use on the slave before using the dump to build the data on the slave. For example, you can add the --skip-innodb option on your slave to disable the InnoDB engine. If a specific engine does not exist, MySQL will use the default engine type, usually MyISAM. If you want to disable further engines in this way, you may want to consider building a special binary to be used on the slave that only supports the engines you want.

  • If you are using raw data files for the population of the slave, you will be unable to change the initial table format. Instead, use ALTER TABLE to change the table types after the slave has been started.

  • For new master/slave replication setups where there are currently no tables on the master, avoid specifying the engine type when creating new tables.

If you are already running a replication solution and want to convert your existing tables to another engine type, follow these steps:

  1. Stop the slave from running replication updates:

    mysql> STOP SLAVE;

    This will enable you to change engine types without interruptions.

  2. Execute an ALTER TABLE ... Engine='enginetype' for each table where you want to change the engine type.

  3. Start the slave replication process again:

    mysql> START SLAVE;

Although the storage_engine and table_type variables are not replicated, be aware that CREATE TABLE and ALTER TABLE statements that include the engine specification will be correctly replicated to the slave. For example, if you have a CSV table and you execute:

mysql> ALTER TABLE csvtable Engine='MyISAM';

The above statement will be replicated to the slave and the engine type on the slave will be converted to MyISAM, even if you have previously changed the table type on the slave to an engine other than CSV. If you want to retain engine differences on the master and slave, you should be careful to use the storage_engine variable on the master when creating a new table. For example, instead of:

mysql> CREATE TABLE tablea (columna int) Engine=MyISAM;

Use this format:

mysql> SET storage_engine=MyISAM;
mysql> CREATE TABLE tablea (columna int);

When replicated, the storage_engine variable will be ignored, and the CREATE TABLE statement will be executed with the slave's default engine type.

15.2.3. Using Replication for Scale-Out

You can use replication as a scale-out solution, i.e. where you want to split up the load of database queries across multiple database servers, within some reasonable limitations.

Because replication works from the distribution of one master to one or more slaves, using replication for scaleout works best in an environment where you have a high number of reads and low number of writes/updates. Most websites fit into this category, where users are browsing the website, reading articles, posts, or viewing products. Updates only occur during session management, or when making a purchase or adding a comment/message to a forum.

Replication in this situation enables you to distribute the reads over the replication slaves, while still allowing your web servers to communicate with the replication master when a write is required. You can see a sample replication layout for this scenario in Figure 15.1, “Using replication to improve the performance during scaleout”.

Figure 15.1. Using replication to improve the performance during scaleout

Using replication to improve the performance
          during scaleout

If the part of your code that is responsible for database access has been properly abstracted/modularized, converting it to run with a replicated setup should be very smooth and easy. Change the implementation of your database access to send all writes to the master, and to send reads to either the master or a slave. If your code does not have this level of abstraction, setting up a replicated system gives you the opportunity and motivation to clean it up. Start by creating a wrapper library or module that implements the following functions:

  • safe_writer_connect()

  • safe_reader_connect()

  • safe_reader_statement()

  • safe_writer_statement()

safe_ in each function name means that the function takes care of handling all error conditions. You can use different names for the functions. The important thing is to have a unified interface for connecting for reads, connecting for writes, doing a read, and doing a write.

Then convert your client code to use the wrapper library. This may be a painful and scary process at first, but it pays off in the long run. All applications that use the approach just described are able to take advantage of a master/slave configuration, even one involving multiple slaves. The code is much easier to maintain, and adding troubleshooting options is trivial. You need modify only one or two functions; for example, to log how long each statement took, or which statement among those issued gave you an error.

If you have written a lot of code, you may want to automate the conversion task by using the replace utility that comes with standard MySQL distributions, or write your own conversion script. Ideally, your code uses consistent programming style conventions. If not, then you are probably better off rewriting it anyway, or at least going through and manually regularizing it to use a consistent style.

15.2.4. Replicating Different Databases to Different Slaves

There may be situations where you have a single master and want to replicate different databases to different slaves. For example, you may want to distribute different sales data to different departments to help spread the load during data analysis. A sample of this layout is shown in Figure 15.2, “Using replication to replicate separate DBs to multiple hosts”.

Figure 15.2. Using replication to replicate separate DBs to multiple hosts

Using replication to replicate separate DBs
          to multiple hosts

You can achieve this separation by configuring the master and slaves as normal, and then limiting the binary log statements that each slave processes by using the replicate-wild-do-table configuration option on each slave.

For example, to support the separation as shown in Figure 15.2, “Using replication to replicate separate DBs to multiple hosts”, you would configure each slave as follows before enabling replication using START SLAVE:

  • MySQL Slave 1 should have the following configuration options:

    replicate-wild-do-table=sales.%
    replicate-wild-do-table=finance.%
  • MySQL Slave 2 should have the following configuration option:

    replicate-wild-do-table=support.%
  • MySQL Slave 3 should have the following configuration option:

    replicate-wild-do-table=service.%

If you have data that needs to be synchronized to the slaves before replication starts, you have a number of options:

  • Synchronize all the data to each slave, and delete the databases and/or tables that you do not want to keep.

  • Use mysqldump to create a separate dump file for each database and load the appropriate dump file on each slave.

  • Use a raw data file dump and include only the specific files and databases that you need for each slave. This option will not work with InnoDB databases unless you use the innodb_file_per_table option.

Each slave in this configuration will transfer to the entire binary log from the master, but will only execute the events within the binary log that apply to the configured databases and tables.

15.2.5. Improving Replication Performance

As the number of slaves connecting to a master increases, the load, although minimal, also increases, as each slave uses up a client connection to the master. Also, as each slave must receive a full copy of the master binary log, the network load on the master may also increase and start to create a bottleneck.

If you are using a large number of slaves connected to one master, and that master is also busy processing requests (for example, as part of a scaleout solution), then you may want to improve the performance of the replication process.

One way to improve the performance of the replication process is to create a deeper replication structure that enables the master to replicate to only one slave, and for the remaining slaves to connect to this primary slave for their individual replication requirements. A sample of this structure is shown in Figure 15.3, “Using an additional replication host to improve performance”.

Figure 15.3. Using an additional replication host to improve performance

Using an additional replication host to
          improve performance

For this to work, you must configure the MySQL instances as follows:

  • Master 1 is the primary master where all changes and updates are written to the database. Binary logging should be enabled on this machine.

  • Master 2 is the slave to the Master 1 that provides the replication functionality to the remainder of the slaves in the replication structure. Master 2 is the only machine allowed to connect to Master 1. Master 2 also has binary logging enabled, and the --log-slave-updates option so that replication instructions from Master 1 are also written to Master 2's binary log so that they can then be replicated to the true slaves.

  • Slave 1, Slave 2, and Slave 3 act as slaves to Master 2, and replicate the information from Master 2, which is really the data logged on Master 1.

The above solution reduces the client load and the network interface load on the primary master, which should improve the overall performance of the primary master when used as a direct database solution.

If your slaves are having trouble keeping up with the replication process on the master then there are a number of options available:

  • If possible, you should put the relay logs and the data files on different physical drives. To do this, use the --relay-log option to specify the location of the relay log.

  • If the slaves are significantly slower than the master, then you may want to divide up the responsibility for replicating different databases to different slaves. See Section 15.2.4, “Replicating Different Databases to Different Slaves”.

  • If your master makes use of transactions and you are not concerned about transaction support on your slaves, then use MyISAM or another non-transactional engine. See Section 15.2.2, “Using Replication with Different Master and Slave Storage Engines”.

  • If your slaves are not acting as masters, and you have a potential solution in place to ensure that you can bring up a master in the event of failure, then you can switch off --log-slave-updates. This prevents 'dumb' slaves from also logging events they have executed into their own binary log.

15.2.6. Switching Masters During Failover

There is currently no official solution for providing failover between master and slaves in the event of a failure. With the currently available features, you would have to set up a master and a slave (or several slaves), and to write a script that monitors the master to check whether it is up. Then instruct your applications and the slaves to change master in case of failure.

Remember that you can tell a slave to change its master at any time, using the CHANGE MASTER TO statement. The slave will not check whether the databases on the master are compatible with the slave, it will just start executing events from the specified log and postition on the new master. In a failover situation all the servers in the group are probably executing the same events from the same binary log, so changing the source of the events should not affect the database structure or integrity providing you are careful.

Run your slaves with the --log-bin option and without --log-slave-updates. In this way, the slave is ready to become a master as soon as you issue STOP SLAVE; RESET MASTER, and CHANGE MASTER TO statement on the other slaves. For example, assume that you have the structure shown in Figure 15.4, “Redundancy using replication, initial structure”.

Figure 15.4. Redundancy using replication, initial structure

Redundancy using replication, initial
          structure

In this diagram, the MySQL Master holds the master database, the MySQL Slave computers are replication slaves, and the Web Client machines are issuing database reads and writes. Web clients that issue only reads (and would normally be connected to the slaves) are not shown, as they do not need to switch to a new server in the event of failure. For a more detailed example of a read/write scaleout replication structure, see Section 15.2.3, “Using Replication for Scale-Out”.

Each MySQL Slave (Slave 1, Slave 2, and Slave 3) are slaves running with --log-bin and without --log-slave-updates. Because updates received by a slave from the master are not logged in the binary log unless --log-slave-updates is specified, the binary log on each slave is empty initially. If for some reason MySQL Master becomes unavailable, you can pick one of the slaves to become the new master. For example, if you pick Slave 1, all Web Clients should be redirected to Slave 1, which will log updates to its binary log. Slave 2 and Slave 3 should then replicate from Slave 1.

The reason for running the slave without --log-slave-updates is to prevent slaves from receiving updates twice in case you cause one of the slaves to become the new master. Suppose that Slave 1 has --log-slave-updates enabled. Then it will write updates that it receives from Master to its own binary log. When Slave 2 changes from Master to Slave 1 as its master, it may receive updates from Slave 1 that it has already received from Master

Make sure that all slaves have processed any statements in their relay log. On each slave, issue STOP SLAVE IO_THREAD, then check the output of SHOW PROCESSLIST until you see Has read all relay log. When this is true for all slaves, they can be reconfigured to the new setup. On the slave Slave 1 being promoted to become the master, issue STOP SLAVE and RESET MASTER.

On the other slaves Slave 2 and Slave 3, use STOP SLAVE and CHANGE MASTER TO MASTER_HOST='Slave1' (where 'Slave1' represents the real hostname of Slave 1). To CHANGE MASTER, add all information about how to connect to Slave 1 from Slave 2 or Slave 3 (user, password, port). In CHANGE MASTER, there is no need to specify the name of Slave 1's binary log or binary log position to read from: We know it is the first binary log and position 4, which are the defaults for CHANGE MASTER. Finally, use START SLAVE on Slave 2 and Slave 3.

Once the new replication is in place, you will then need to instruct each Web Client to direct their statements to Slave 1. From that point on, all updates statements sent by Web Client to Slave 1 are written to the binary log of Slave 1, which then contains every update statement sent to Slave 1 since Master died.

The resulting server structure is shown in Figure 15.5, “Redundancy using replication, after master failure”.

Figure 15.5. Redundancy using replication, after master failure

Redundancy using replication, after master
          failure

When Master is up again, you must issue on it the same CHANGE MASTER as that issued on Slave 2 and Slave 3, so that Master becomes a slave of S1 and picks up each Web Client writes that it missed while it was down.

To make Master a master again (because it is the most powerful machine, for example), use the preceding procedure as if Slave 1 was unavailable and Master was to be the new master. During this procedure, do not forget to run RESET MASTER on Master before making Slave 1, Slave 2, and Slave 3 slaves of Master. Otherwise, they may pick up old Web Client writes from before the point at which Master became unavailable.

Note that there is no synchronization between the different slaves to a master. Some slaves might be ahead of others. This means that the concept outlined in the previous example might not work. In practice, however, the relay logs of different slaves will most likely not be far behind the master, so it would work, anyway (but there is no guarantee).

A good way to keep your applications informed as to the location of the master is by having a dynamic DNS entry for the master. With bind you can use nsupdate to dynamically update your DNS.

15.2.7. Setting Up Replication Using SSL

Setting up replication using an SSL connection is similar to setting up a server and client using SSL. You will need to obtain (or create) a suitable security certificate that you can use on the master, and a similar certificate (from the same certificate authority) on each slave.

To use SSL for encrypting the transfer of the binary log required during replication you must first set up the master to support SSL network connections. If the master does not support SSL connections (because it has not been compiled or configured for SSL), then replication through an SSL connection will not be possible.

For more information on setting up a server and client for SSL connectivity, see Section 5.8.7.2, “Using SSL Connections”.

To enable SSL on the master you will need to create or obtain suitable certficates and then add the following configuration options to the master's configuration within the mysqld section:

ssl-ca=cacert.pem
ssl-cert=server-cert.pem
ssl-key=server-key.pem

The options are as follows:

  • ssl-ca identifies the Certificate Authority (CA) certificate.

  • ssl-cert identifies the server public key. This can be sent to the client and authenticated against the CA certificate that it has.

  • ssl-key identifies the server private key.

On the slave, you have two options available for setting the SSL information. You can either add the slaves certificates to the client section of the slave configuration file, or you can explicitly specify the SSL information using the CHANGE MASTER statement.

Using the former option, add the following lines to the client section of the slave configuration file:

[client]
ssl-ca=cacert.pem
ssl-cert=server-cert.pem
ssl-key=server-key.pem

Restart the slave server, using the --skip-slave to prevent the slave from connecting to the master. Use CHANGE MASTER to specify the master configuration, using the master_ssl option to enable SSL connectivity:

mysql> CHANGE MASTER TO \
    MASTER_HOST='master_hostname', \
    MASTER_USER='replicate', \
    MASTER_PASSWORD='password', \
    MASTER_SSL=1;

To specify the SSL certificate options during the CHANGE MASTER command, append the SSL options:

CHANGE MASTER TO \
      MASTER_HOST='master_hostname', \
      MASTER_USER='replicate', \
      MASTER_PASSWORD='password', \
      MASTER_SSL=1, \
      MASTER_SSL_CA = 'ca_file_name', \
      MASTER_SSL_CAPATH = 'ca_directory_name', \
      MASTER_SSL_CERT = 'cert_file_name', \
      MASTER_SSL_KEY = 'key_file_name';

Once the master information has been updated, start the slave replication process:

mysql> START SLAVE;

You can use the SHOW SLAVE STATUS to confirm that SSL connection has been completed.

For more information on the CHANGE MASTER TO syntax, see Section 12.6.2.1, “CHANGE MASTER TO Syntax”.

If you want to enforce SSL connections to be used during replication, then create a user with the REPLICATION SLAVE privilege and use the REQUIRE_SSL option for that user. For example:

mysql> GRANT REPLICATION SLAVE ON *.*
    -> TO 'repl'@'%.mydomain.com' IDENTIFIED BY 'slavepass' REQUIRE SSL;

15.3. Replication Notes and Tips

15.3.1. Replication Features and Issues

In general, replication compatibility at the SQL level requires that any features used be supported by both the master and the slave servers. If you use a feature on a master server that is available only as of a given version of MySQL, you cannot replicate to a slave that is older than that version. Such incompatibilities are likely to occur between series, so that, for example, you cannot replicate from MySQL 5.0 to 4.1. However, these incompatibilities also can occur for within-series replication. For example, the SLEEP() function is available in MySQL 5.0.12 and up. If you use this function on the master server, you cannot replicate to a slave server that is older than MySQL 5.0.12.

If you are planning to use replication between 5.0 and a previous version of MySQL you should consult the edition of the MySQL Reference Manual corresponding to the earlier release series for information regarding the replication characteristics of that series.

The following list provides details about what is supported and what is not. Additional InnoDB-specific information about replication is given in Section 13.2.6.5, “InnoDB and MySQL Replication”.

Replication issues with regard to stored routines and triggers is described in Section 18.4, “Binary Logging of Stored Routines and Triggers”.

15.3.1.1. Replication and AUTO_INCREMENT

Replication of AUTO_INCREMENT, LAST_INSERT_ID(), and TIMESTAMP values is done correctly, subject to the following exceptions.

  • INSERT DELAYED ... VALUES(LAST_INSERT_ID()) inserts a different value on the master and the slave. (Bug#20819) This is fixed in MySQL 5.1 when using row-based or mixed-format binary logging.

  • Before MySQL 5.0.26, a stored procedure that uses LAST_INSERT_ID() does not replicate properly.

  • When a statement uses a stored function that inserts into an AUTO_INCREMENT column, the generated AUTO_INCREMENT value is not written into the binary log, so a different value can in some cases be inserted on the slave. This is also true of a trigger that causes an INSERT into an AUTO_INCREMENT column.

  • Adding an AUTO_INCREMENT column to a table with ALTER TABLE might not produce the same ordering of the rows on the slave and the master. This occurs because the order in which the rows are numbered depends on the specific storage engine used for the table and the order in which the rows were inserted. If it is important to have the same order on the master and slave, the rows must be ordered before assigning an AUTO_INCREMENT number. Assuming that you want to add an AUTO_INCREMENT column to the table t1, the following statements produce a new table t2 identical to t1 but with an AUTO_INCREMENT column:

    CREATE TABLE t2 LIKE t1;
    ALTER TABLE t2 ADD id INT AUTO_INCREMENT PRIMARY KEY;
    INSERT INTO t2 SELECT * FROM t1 ORDER BY col1, col2;
    

    This assumes that the table t1 has columns col1 and col2.

    Important

    To guarantee the same ordering on both master and slave, all columns of t1 must be referenced in the ORDER BY clause.

  • The instructions just given are subject to the limitations of CREATE TABLE ... LIKE: Foreign key definitions are ignored, as are the DATA DIRECTORY and INDEX DIRECTORY table options. If a table definition includes any of those characteristics, create t2 using a CREATE TABLE statement that is identical to the one used to create t1, but with the addition of the AUTO_INCREMENT column.

  • Regardless of the method used to create and populate the copy having the AUTO_INCREMENT column, the final step is to drop the original table and then rename the copy:

    DROP t1;
    ALTER TABLE t2 RENAME t1;
    

    See also Section B.1.7.1, “Problems with ALTER TABLE.

15.3.1.2. Replication and Character Sets

The following applies to replication between MySQL servers that use different character sets:

  1. If the master uses MySQL 4.1, you must always use the same global character set and collation on the master and the slave, regardless of the MySQL version running on the slave. (These are controlled by the --character-set-server and --collation-server options.) Otherwise, you may get duplicate-key errors on the slave, because a key that is unique in the master character set might not be unique in the slave character set. Note that this is not a cause for concern when master and slave are both MySQL 5.0 or later.

  2. If the master is older than MySQL 4.1.3, the character set of any client should never be made different from its global value because this character set change is not known to the slave. In other words, clients should not use SET NAMES, SET CHARACTER SET, and so forth. If both the master and the slave are 4.1.3 or newer, clients can freely set session values for character set variables because these settings are written to the binary log and so are known to the slave. That is, clients can use SET NAMES or SET CHARACTER SET or can set variables such as collation_client or collation_server. However, clients are prevented from changing the global value of these variables; as stated previously, the master and slave must always have identical global character set values.

  3. If you have databases on the master with character sets that differ from the global character_set_server value, you should design your CREATE TABLE statements so that tables in those databases do not implicitly rely on the database default character set (see Bug#2326). A good workaround is to state the character set and collation explicitly in CREATE TABLE statements.

15.3.1.3. Replication DIRECTORY Statements

If a DATA DIRECTORY or INDEX DIRECTORY table option is used in a CREATE TABLE statement on the master server, the table option is also used on the slave. This can cause problems if no corresponding directory exists in the slave host filesystem or if it exists but is not accessible to the slave server. MySQL supports an sql_mode option called NO_DIR_IN_CREATE. If the slave server is run with this SQL mode enabled, it ignores the DATA DIRECTORY and INDEX DIRECTORY table options when replicating CREATE TABLE statements. The result is that MyISAM data and index files are created in the table's database directory.

15.3.1.4. Replication with Floating-Point Values

Floating-point values are approximate, so comparisons involving them are inexact. This is true for operations that use floating-point values explicitly, or values that are converted to floating-point implicitly. Comparisons of floating-point values might yield different results on master and slave servers due to differences in computer architecture, the compiler used to build MySQL, and so forth. See Section 11.2.2, “Type Conversion in Expression Evaluation”, and Section B.1.5.8, “Problems with Floating-Point Comparisons”.

MySQL Enterprise For expert advice regarding replication subscribe to the MySQL Enterprise Monitor. For more information see http://www.mysql.com/products/enterprise/advisors.html.

15.3.1.5. Replication and FLUSH

Some forms of the FLUSH statement are not logged because they could cause problems if replicated to a slave: FLUSH LOGS, FLUSH MASTER, FLUSH SLAVE, and FLUSH TABLES WITH READ LOCK. For a syntax example, see Section 12.5.5.2, “FLUSH Syntax”. The FLUSH TABLES, ANALYZE TABLE, OPTIMIZE TABLE, and REPAIR TABLE statements are written to the binary log and thus replicated to slaves. This is not normally a problem because these statements do not modify table data. However, this can cause difficulties under certain circumstances. If you replicate the privilege tables in the mysql database and update those tables directly without using GRANT, you must issue a FLUSH PRIVILEGES on the slaves to put the new privileges into effect. In addition, if you use FLUSH TABLES when renaming a MyISAM table that is part of a MERGE table, you must issue FLUSH TABLES manually on the slaves. These statements are written to the binary log unless you specify NO_WRITE_TO_BINLOG or its alias LOCAL.

15.3.1.6. Replication and Functions

Certain functions do not replicate well under some conditions:

  • The USER(), CURRENT_USER(), UUID(), VERSION(), and LOAD_FILE() functions are replicated without change and thus do not work reliably on the slave.

  • As of MySQL 5.0.13, the SYSDATE() function is no longer equivalent to NOW(). Implications are that SYSDATE() is not replication-safe because it is not affected by SET TIMESTAMP statements in the binary log and is non-deterministic. To avoid this, you can start the server with the --sysdate-is-now option to cause SYSDATE() to be an alias for NOW().

  • The GET_LOCK(), RELEASE_LOCK(), IS_FREE_LOCK(), and IS_USED_LOCK() functions that handle user-level locks are replicated without the slave knowing the concurrency context on master. Therefore, these functions should not be used to insert into a master's table because the content on the slave would differ. (For example, do not issue a statement such as INSERT INTO mytable VALUES(GET_LOCK(...)).)

As a workaround for the preceding limitations, you can use the strategy of saving the problematic function result in a user variable and referring to the variable in a later statement. For example, the following single-row INSERT is problematic due to the reference to the UUID() function:

INSERT INTO t VALUES(UUID());

To work around the problem, do this instead:

SET @my_uuid = UUID();
INSERT INTO t VALUES(@my_uuid);

That sequence of statements replicates because the value of @my_uuid is stored in the binary log as a user-variable event prior to the INSERT statement and is available for use in the INSERT.

The same idea applies to multiple-row inserts, but is more cumbersome to use. For a two-row insert, you can do this:

SET @my_uuid1 = UUID(); @my_uuid2 = UUID();
INSERT INTO t VALUES(@my_uuid1),(@my_uuid2);

However, if the number of rows is large or unknown, the workaround is difficult or impracticable. For example, you cannot convert the following statement to one in which a given individual user variable is associated with each row:

INSERT INTO t2 SELECT UUID(), * FROM t1;

Non-delayed INSERT statements that refer to RAND() or user-defined variables replicate correctly. However, changing the statements to use INSERT DELAYED can result in different results on master and slave.

The FOUND_ROWS() and ROW_COUNT() functions are also not replicated reliably.

15.3.1.7. Replication and LOAD ... Operations

Using LOAD TABLE FROM MASTER where the master is running MySQL 4.1 and the slave is running MySQL 5.0 may corrupt the table data, and is not supported. (Bug#16261)

The following applies only if either the master or the slave is running MySQL version 5.0.3 or older: If on the master a LOAD DATA INFILE is interrupted (integrity constraint violation, killed connection, and so on), the slave skips the LOAD DATA INFILE entirely. This means that if this command permanently inserted or updated table records before being interrupted, these modifications are not replicated to the slave.

15.3.1.8. Replication During a Master Crash

A crash on the master side can result in the master's binary log having a final position less than the most recent position read by the slave, due to the master's binary log file not being flushed. This can cause the slave not to be able to replicate when the master comes back up. Setting sync_binlog=1 in the master my.cnf file helps to minimize this problem because it causes the master to flush its binary log more frequently.

15.3.1.9. Replication During a Master Shutdown

It is safe to shut down a master server and restart it later. When a slave loses its connection to the master, the slave tries to reconnect immediately and retries periodically if that fails. The default is to retry every 60 seconds. This may be changed with the --master-connect-retry option. A slave also is able to deal with network connectivity outages. However, the slave notices the network outage only after receiving no data from the master for slave_net_timeout seconds. If your outages are short, you may want to decrease slave_net_timeout. See Section 5.2.3, “System Variables”.

15.3.1.10. Replication with MEMORY Tables

When a server shuts down and restarts, its MEMORY (HEAP) tables become empty. The master replicates this effect to slaves as follows: The first time that the master uses each MEMORY table after startup, it logs an event that notifies the slaves that the table needs to be emptied by writing a DELETE statement for that table to the binary log. See Section 13.4, “The MEMORY (HEAP) Storage Engine”, for more information about MEMORY tables.

15.3.1.11. Replication and the Query Optimizer

It is possible for the data on the master and slave to become different if a statement is designed in such a way that the data modification is non-deterministic; that is, left to the will of the query optimizer. (This is in general not a good practice, even outside of replication.) For a detailed explanation of this issue, see Section B.1.8.1, “Open Issues in MySQL”.

15.3.1.12. Slave Errors during Replication

If a statement on a slave produces an error, the slave SQL thread terminates, and the slave writes a message to its error log. You should then connect to the slave manually and determine the cause of the problem. (SHOW SLAVE STATUS is useful for this.) Then fix the problem (for example, you might need to create a non-existent table) and run START SLAVE.

15.3.1.13. Replication during a Slave Shutdown

Shutting down the slave (cleanly) is also safe because it keeps track of where it left off. Unclean shutdowns might produce problems, especially if the disk cache was not flushed to disk before the system went down. Your system fault tolerance is greatly increased if you have a good uninterruptible power supply. Unclean shutdowns of the master may cause inconsistencies between the content of tables and the binary log in master; this can be avoided by using InnoDB tables and the --innodb-safe-binlog option on the master. See Section 5.11.3, “The Binary Log”.

Note

--innodb-safe-binlog is unneeded as of MySQL 5.0.3, having been made obsolete by the introduction of XA transaction support.

15.3.1.14. Replication and Temporary Tables

Temporary tables are replicated except in the case where you shut down the slave server (not just the slave threads) and you have replicated temporary tables that are used in updates that have not yet been executed on the slave. If you shut down the slave server, the temporary tables needed by those updates are no longer available when the slave is restarted. To avoid this problem, do not shut down the slave while it has temporary tables open. Instead, use the following procedure:

  1. Issue a STOP SLAVE statement.

  2. Use SHOW STATUS to check the value of the Slave_open_temp_tables variable.

  3. If the value is 0, issue a mysqladmin shutdown command to stop the slave.

  4. If the value is not 0, restart the slave threads with START SLAVE.

  5. Repeat the procedure later until the Slave_open_temp_tables variable is 0 and you can stop the slave.

15.3.1.15. Replication Retries and Timeouts

In MySQL 5.0 (starting from 5.0.3), there is a global system variable slave_transaction_retries: If the replication slave SQL thread fails to execute a transaction because of an InnoDB deadlock or because it exceeded the InnoDB innodb_lock_wait_timeout or the NDBCluster TransactionDeadlockDetectionTimeout or TransactionInactiveTimeout value, the transaction automatically retries slave_transaction_retries times before stopping with an error. The default value is 10. Starting from MySQL 5.0.4, the total retry count can be seen in the output of SHOW STATUS; see Section 5.2.5, “Status Variables”.

15.3.1.16. Replication and Time Zones

If the master uses MySQL 4.1, the same system time zone should be set for both master and slave. Otherwise some statements will not be replicated properly, such as statements that use the NOW() or FROM_UNIXTIME() functions. You can set the time zone in which MySQL server runs by using the --timezone=timezone_name option of the mysqld_safe script or by setting the TZ environment variable. Both master and slave should also have the same default connection time zone setting; that is, the --default-time-zone parameter should have the same value for both master and slave. Note that this is not necessary when the master is MySQL 5.0 or later.

CONVERT_TZ(...,...,@@global.time_zone) is not properly replicated. CONVERT_TZ(...,...,@@session.time_zone) is properly replicated only if the master and slave are from MySQL 5.0.4 or newer.

15.3.1.17. Replication and Transactions

It is possible to replicate transactional tables on the master using non-transactional tables on the slave. For example, you can replicate an InnoDB master table as a MyISAM slave table. However, there are issues that you should consider before you do this:

  • There are problems if the slave is stopped in the middle of a BEGIN/COMMIT block because the slave restarts at the beginning of the BEGIN block.

  • When the storage engine type of the slave is non-transactional, transactions on the master that mix updates of transactional and non-transactional tables should be avoided because they can cause inconsistency of the data between the master's transactional table and the slave's non-transactional table. That is, such transactions can lead to master storage engine-specific behavior with the possible effect of replication going out of synchrony. MySQL does not issue a warning about this currently, so extra care should be taken when replicating transactional tables from the master to non-transactional ones on the slaves.

Due to the non-transactional nature of MyISAM tables, it is possible to have a statement that only partially updates a table and returns an error code. This can happen, for example, on a multiple-row insert that has one row violating a key constraint, or if a long update statement is killed after updating some of the rows. If that happens on the master, the slave thread exits and waits for the database administrator to decide what to do about it unless the error code is legitimate and execution of the statement results in the same error code on the slave. If this error code validation behavior is not desirable, some or all errors can be masked out (ignored) with the --slave-skip-errors option.

If you update transactional tables from non-transactional tables inside a BEGIN/COMMIT sequence, updates to the binary log may be out of synchrony with table states if the non-transactional table is updated before the transaction commits. This occurs because the transaction is written to the binary log only when it is committed.

In situations where transactions mix updates to transactional and non-transactional tables, the order of statements in the binary log is correct, and all needed statements are written to the binary log even in case of a ROLLBACK. However, when a second connection updates the non-transactional table before the first connection's transaction is complete, statements can be logged out of order, because the second connection's update is written immediately after it is performed, regardless of the state of the transaction being performed by the first connection.

Caution

You should not use transactions in a replication environment that update both transactional and non-transactional tables.

15.3.1.18. Replication and Triggers

Known issue: In MySQL 5.0.17, the syntax for CREATE TRIGGER changed to include a DEFINER clause for specifying which access privileges to check at trigger invocation time. (See Section 19.1, “CREATE TRIGGER Syntax”, for more information.) However, if you attempt to replicate from a master server older than MySQL 5.0.17 to a slave running MySQL 5.0.17 through 5.0.19, replication of CREATE TRIGGER statements fails on the slave with a Definer not fully qualified error. A workaround is to create triggers on the master using a version-specific comment embedded in each CREATE TRIGGER statement:

CREATE /*!50017 DEFINER = 'root'@'localhost' */ TRIGGER ... ;

CREATE TRIGGER statements written this way will replicate to newer slaves, which pick up the DEFINER clause from the comment and execute successfully.

This slave problem is fixed as of MySQL 5.0.20.

15.3.1.19. Replication and User Privileges

User privileges are replicated only if the mysql database is replicated. That is, the GRANT, REVOKE, SET PASSWORD, CREATE USER, and DROP USER statements take effect on the slave only if the replication setup includes the mysql database.

If you're replicating all databases, but don't want statements that affect user privileges to be replicated, set up the slave to not replicate the mysql database, using the --replicate-wild-ignore-table=mysql.% option. The slave will recognize that issuing privilege-related SQL statements won't have an effect, and thus not execute those statements.

15.3.1.20. Replication and Variables

The FOREIGN_KEY_CHECKS, SQL_MODE, UNIQUE_CHECKS, and SQL_AUTO_IS_NULL variables are all replicated in MySQL 5.0. The storage_engine system variable (also known as table_type) is not yet replicated, which is a good thing for replication between different storage engines.

Starting from MySQL 5.0.3 (master and slave), replication works even if the master and slave have different global character set variables. Starting from MySQL 5.0.4 (master and slave), replication works even if the master and slave have different global time zone variables.

Session variables are not replicated properly when used in statements that update tables. For example, SET MAX_JOIN_SIZE=1000 followed by INSERT INTO mytable VALUES(@@MAX_JOIN_SIZE) will not insert the same data on the master and the slave. This does not apply to the common sequence of SET TIME_ZONE=... followed by INSERT INTO mytable VALUES(CONVERT_TZ(...,...,@@time_zone)), which replicates correctly as of MySQL 5.0.4.

Update statements that refer to user-defined variables (that is, variables of the form @var_name) are replicated correctly in MySQL 5.0. However, this is not true for versions prior to 4.1. Note that user variable names are case insensitive starting in MySQL 5.0. You should take this into account when setting up replication between MySQL 5.0 and older versions.

15.3.1.21. Replication and Views

Views are always replicated to slaves. Views are filtered by their own name, not by the tables they refer to. This means that a view can be replicated to the slave even if the view contains a table that would normally be filtered out by replication-ignore-table rules. Care should therefore be taken to ensure that views do not replicate table data that would normally be filtered for security reasons.

15.3.2. Replication Compatibility Between MySQL Versions

The binary log format as implemented in MySQL 5.0 is considerably different from that used in previous versions. Major changes were made in MySQL 5.0.3 (for improvements to handling of character sets and LOAD DATA INFILE) and 5.0.4 (for improvements to handling of time zones).

We recommend using the most recent MySQL version available because replication capabilities are continually being improved. We also recommend using the same version for both the master and the slave. We recommend upgrading masters and slaves running alpha or beta versions to new (production) versions. Replication from a 5.0.3 master to a 5.0.2 slave will fail; from a 5.0.4 master to a 5.0.3 slave will also fail.

In general (but not always), slaves running MySQL 5.0.x may be used with older masters, but not the reverse. For more information on potential issues, see Section 15.3.1, “Replication Features and Issues”.

Note

You cannot replicate from a master that uses a newer binary log format to a slave that uses an older format (for example, from MySQL 5.0 to MySQL 4.1.) This has significant implications for upgrading replication servers, as described in Section 15.3.3, “Upgrading a Replication Setup”.

The preceding information pertains to replication compatibility at the protocol level. However, there can be other constraints, such as SQL-level compatibility issues. For example, a 5.0 master cannot replicate to a 4.1 slave if the replicated statements use SQL features available in 5.0 but not in 4.1. These and other issues are discussed in Section 15.3.1, “Replication Features and Issues”.

15.3.3. Upgrading a Replication Setup

When you upgrade servers that participate in a replication setup, the procedure for upgrading depends on the current server versions and the version to which you are upgrading.

This section applies to upgrading replication from MySQL 3.23, 4.0, or 4.1 to MySQL 5.0. A 4.0 server should be 4.0.3 or newer.

When you upgrade a master to 5.0 from an earlier MySQL release series, you should first ensure that all the slaves of this master are using the same 5.0.x release. If this is not the case, you should first upgrade the slaves. To upgrade each slave, shut it down, upgrade it to the appropriate 5.0.x version, restart it, and restart replication. The 5.0 slave is able to read the old relay logs written prior to the upgrade and to execute the statements they contain. Relay logs created by the slave after the upgrade are in 5.0 format.

After the slaves have been upgraded, shut down the master, upgrade it to the same 5.0.x release as the slaves, and restart it. The 5.0 master is able to read the old binary logs written prior to the upgrade and to send them to the 5.0 slaves. The slaves recognize the old format and handle it properly. Binary logs created by the master following the upgrade are in 5.0 format. These too are recognized by the 5.0 slaves.

In other words, there are no measures to take when upgrading to MySQL 5.0, except that the slaves must be MySQL 5.0 before you can upgrade the master to 5.0. Note that downgrading from 5.0 to older versions does not work so simply: You must ensure that any 5.0 binary logs or relay logs have been fully processed, so that you can remove them before proceeding with the downgrade.

15.3.4. Replication FAQ

Questions

  • 16.3.4.1: How do I configure a slave if the master is running and I do not want to stop it?

  • 16.3.4.2: Does the slave need to be connected to the master all the time?

  • 16.3.4.3: How do I know how late a slave is compared to the master? In other words, how do I know the date of the last statement replicated by the slave?

  • 16.3.4.4: How do I force the master to block updates until the slave catches up?

  • 16.3.4.5: What issues should I be aware of when setting up two-way replication?

  • 16.3.4.6: How can I use replication to improve performance of my system?

  • 16.3.4.7: What should I do to prepare client code in my own applications to use performance-enhancing replication?

  • 16.3.4.8: When and how much can MySQL replication improve the performance of my system?

  • 16.3.4.9: How do I prevent GRANT and REVOKE statements from replicating to slave machines?

  • 16.3.4.10: Does replication work on mixed operating systems (for example, the master runs on Linux while slaves run on Mac OS X and Windows)?

  • 16.3.4.11: Does replication work on mixed hardware architectures (for example, the master runs on a 64-bit machine while slaves run on 32-bit machines)?

Questions and Answers

16.3.4.1: How do I configure a slave if the master is running and I do not want to stop it?

There are several possibilities. If you have taken a snapshot backup of the master at some point and recorded the binary log filename and offset (from the output of SHOW MASTER STATUS) corresponding to the snapshot, use the following procedure:

  1. Make sure that the slave is assigned a unique server ID.

  2. Execute the following statement on the slave, filling in appropriate values for each option:

    mysql> CHANGE MASTER TO
        ->     MASTER_HOST='master_host_name',
        ->     MASTER_USER='master_user_name',
        ->     MASTER_PASSWORD='master_pass',
        ->     MASTER_LOG_FILE='recorded_log_file_name',
        ->     MASTER_LOG_POS=recorded_log_position;
    
  3. Execute START SLAVE on the slave.

If you do not have a backup of the master server, here is a quick procedure for creating one. All steps should be performed on the master host.

  1. Issue this statement to acquire a global read lock:

    mysql> FLUSH TABLES WITH READ LOCK;
    
  2. With the lock still in place, execute this command (or a variation of it):

    shell> tar zcf /tmp/backup.tar.gz /var/lib/mysql
    
  3. Issue this statement and record the output, which you will need later:

    mysql> SHOW MASTER STATUS;
    
  4. Release the lock:

    mysql> UNLOCK TABLES;
    

An alternative to using the preceding procedure to make a binary copy is to make an SQL dump of the master. To do this, you can use mysqldump --master-data on your master and later load the SQL dump into your slave. However, this is slower than making a binary copy.

Regardless of which of the two methods you use, afterward follow the instructions for the case when you have a snapshot and have recorded the log filename and offset. You can use the same snapshot to set up several slaves. Once you have the snapshot of the master, you can wait to set up a slave as long as the binary logs of the master are left intact. The two practical limitations on the length of time you can wait are the amount of disk space available to retain binary logs on the master and the length of time it takes the slave to catch up.

16.3.4.2: Does the slave need to be connected to the master all the time?

No, it does not. The slave can go down or stay disconnected for hours or even days, and then reconnect and catch up on updates. For example, you can set up a master/slave relationship over a dial-up link where the link is up only sporadically and for short periods of time. The implication of this is that, at any given time, the slave is not guaranteed to be in synchrony with the master unless you take some special measures.

16.3.4.3: How do I know how late a slave is compared to the master? In other words, how do I know the date of the last statement replicated by the slave?

You can read the Seconds_Behind_Master column in SHOW SLAVE STATUS. See Section 15.4.1, “Replication Implementation Details”.

When the slave SQL thread executes an event read from the master, it modifies its own time to the event timestamp. (This is why TIMESTAMP is well replicated.) In the Time column in the output of SHOW PROCESSLIST, the number of seconds displayed for the slave SQL thread is the number of seconds between the timestamp of the last replicated event and the real time of the slave machine. You can use this to determine the date of the last replicated event. Note that if your slave has been disconnected from the master for one hour, and then reconnects, you may immediately see Time values like 3600 for the slave SQL thread in SHOW PROCESSLIST. This is because the slave is executing statements that are one hour old.

16.3.4.4: How do I force the master to block updates until the slave catches up?

Use the following procedure:

  1. On the master, execute these statements:

    mysql> FLUSH TABLES WITH READ LOCK;
    mysql> SHOW MASTER STATUS;
    

    Record the replication coordinates (the log filename and offset) from the output of the SHOW statement.

  2. On the slave, issue the following statement, where the arguments to the MASTER_POS_WAIT() function are the replication coordinate values obtained in the previous step:

    mysql> SELECT MASTER_POS_WAIT('log_name', log_offset);
    

    The SELECT statement blocks until the slave reaches the specified log file and offset. At that point, the slave is in synchrony with the master and the statement returns.

  3. On the master, issue the following statement to allow the master to begin processing updates again:

    mysql> UNLOCK TABLES;
    

16.3.4.5: What issues should I be aware of when setting up two-way replication?

MySQL replication currently does not support any locking protocol between master and slave to guarantee the atomicity of a distributed (cross-server) update. In other words, it is possible for client A to make an update to co-master 1, and in the meantime, before it propagates to co-master 2, client B could make an update to co-master 2 that makes the update of client A work differently than it did on co-master 1. Thus, when the update of client A makes it to co-master 2, it produces tables that are different from what you have on co-master 1, even after all the updates from co-master 2 have also propagated. This means that you should not chain two servers together in a two-way replication relationship unless you are sure that your updates can safely happen in any order, or unless you take care of mis-ordered updates somehow in the client code.

You should also realize that two-way replication actually does not improve performance very much (if at all) as far as updates are concerned. Each server must do the same number of updates, just as you would have a single server do. The only difference is that there is a little less lock contention, because the updates originating on another server are serialized in one slave thread. Even this benefit might be offset by network delays.

16.3.4.6: How can I use replication to improve performance of my system?

You should set up one server as the master and direct all writes to it. Then configure as many slaves as you have the budget and rackspace for, and distribute the reads among the master and the slaves. You can also start the slaves with the --skip-innodb, --skip-bdb, --low-priority-updates, and --delay-key-write=ALL options to get speed improvements on the slave end. In this case, the slave uses non-transactional MyISAM tables instead of InnoDB and BDB tables to get more speed by eliminating transactional overhead.

16.3.4.7: What should I do to prepare client code in my own applications to use performance-enhancing replication?

If the part of your code that is responsible for database access has been properly abstracted/modularized, converting it to run with a replicated setup should be very smooth and easy. Change the implementation of your database access to send all writes to the master, and to send reads to either the master or a slave. If your code does not have this level of abstraction, setting up a replicated system gives you the opportunity and motivation to it clean up. Start by creating a wrapper library or module that implements the following functions:

  • safe_writer_connect()

  • safe_reader_connect()

  • safe_reader_statement()

  • safe_writer_statement()

safe_ in each function name means that the function takes care of handling all error conditions. You can use different names for the functions. The important thing is to have a unified interface for connecting for reads, connecting for writes, doing a read, and doing a write.

Then convert your client code to use the wrapper library. This may be a painful and scary process at first, but it pays off in the long run. All applications that use the approach just described are able to take advantage of a master/slave configuration, even one involving multiple slaves. The code is much easier to maintain, and adding troubleshooting options is trivial. You need modify only one or two functions; for example, to log how long each statement took, or which statement among those issued gave you an error.

If you have written a lot of code, you may want to automate the conversion task by using the replace utility that comes with standard MySQL distributions, or write your own conversion script. Ideally, your code uses consistent programming style conventions. If not, then you are probably better off rewriting it anyway, or at least going through and manually regularizing it to use a consistent style.

16.3.4.8: When and how much can MySQL replication improve the performance of my system?

MySQL replication is most beneficial for a system that processes frequent reads and infrequent writes. In theory, by using a single-master/multiple-slave setup, you can scale the system by adding more slaves until you either run out of network bandwidth, or your update load grows to the point that the master cannot handle it.

To determine how many slaves you can use before the added benefits begin to level out, and how much you can improve performance of your site, you need to know your query patterns, and to determine empirically by benchmarking the relationship between the throughput for reads (reads per second, or reads) and for writes (writes) on a typical master and a typical slave. The example here shows a rather simplified calculation of what you can get with replication for a hypothetical system.

Let's say that system load consists of 10% writes and 90% reads, and we have determined by benchmarking that reads is 1200 – 2 × writes. In other words, the system can do 1,200 reads per second with no writes, the average write is twice as slow as the average read, and the relationship is linear. Let us suppose that the master and each slave have the same capacity, and that we have one master and N slaves. Then we have for each server (master or slave):

reads = 1200 – 2 × writes

reads = 9 × writes / (N + 1) (reads are split, but writes go to all servers)

9 × writes / (N + 1) + 2 × writes = 1200

writes = 1200 / (2 + 9/(N+1))

The last equation indicates the maximum number of writes for N slaves, given a maximum possible read rate of 1,200 per minute and a ratio of nine reads per write.

This analysis yields the following conclusions:

  • If N = 0 (which means we have no replication), our system can handle about 1200/11 = 109 writes per second.

  • If N = 1, we get up to 184 writes per second.

  • If N = 8, we get up to 400 writes per second.

  • If N = 17, we get up to 480 writes per second.

  • Eventually, as N approaches infinity (and our budget negative infinity), we can get very close to 600 writes per second, increasing system throughput about 5.5 times. However, with only eight servers, we increase it nearly four times.

Note that these computations assume infinite network bandwidth and neglect several other factors that could be significant on your system. In many cases, you may not be able to perform a computation similar to the one just shown that accurately predicts what will happen on your system if you add N replication slaves. However, answering the following questions should help you decide whether and by how much replication will improve the performance of your system:

  • What is the read/write ratio on your system?

  • How much more write load can one server handle if you reduce the reads?

  • For how many slaves do you have bandwidth available on your network?

16.3.4.9: How do I prevent GRANT and REVOKE statements from replicating to slave machines?

Start the server with the --replicate-wild-ignore-table=mysql.% option.

16.3.4.10: Does replication work on mixed operating systems (for example, the master runs on Linux while slaves run on Mac OS X and Windows)?

Yes.

16.3.4.11: Does replication work on mixed hardware architectures (for example, the master runs on a 64-bit machine while slaves run on 32-bit machines)?

Yes.

15.3.5. Troubleshooting Replication

If you have followed the instructions, and your replication setup is not working, the first thing to do is check the error log for messages. Many users have lost time by not doing this soon enough after encountering problems.

If you cannot tell from the error log what the problem was, try the following techniques:

  • Verify that the master has binary logging enabled by issuing a SHOW MASTER STATUS statement. If logging is enabled, Position is non-zero. If binary logging is not enabled, verify that you are running the master with the --log-bin and --server-id options.

  • Verify that the slave is running. Use SHOW SLAVE STATUS to check whether the Slave_IO_Running and Slave_SQL_Running values are both Yes. If not, verify the options that were used when starting the slave server. For example, --skip-slave-start prevents the slave threads from starting until you issue a START SLAVE statement.

  • If the slave is running, check whether it established a connection to the master. Use SHOW PROCESSLIST, find the I/O and SQL threads and check their State column to see what they display. See Section 15.4.1, “Replication Implementation Details”. If the I/O thread state says Connecting to master, verify the privileges for the replication user on the master, the master hostname, your DNS setup, whether the master is actually running, and whether it is reachable from the slave.

  • If the slave was running previously but has stopped, the reason usually is that some statement that succeeded on the master failed on the slave. This should never happen if you have taken a proper snapshot of the master, and never modified the data on the slave outside of the slave thread. If the slave stops unexpectedly, it is a bug or you have encountered one of the known replication limitations described in Section 15.3.1, “Replication Features and Issues”. If it is a bug, see Section 15.3.6, “How to Report Replication Bugs or Problems”, for instructions on how to report it.

    MySQL Enterprise For immediate notification whenever a slave stops, subscribe to the MySQL Enterprise Monitor. For more information see http://www.mysql.com/products/enterprise/advisors.html.

  • If a statement that succeeded on the master refuses to run on the slave, try the following procedure if it is not feasible to do a full database resynchronization by deleting the slave's databases and copying a new snapshot from the master:

    1. Determine whether the affected table on the slave is different from the master table. Try to understand how this happened. Then make the slave's table identical to the master's and run START SLAVE.

    2. If the preceding step does not work or does not apply, try to understand whether it would be safe to make the update manually (if needed) and then ignore the next statement from the master.

    3. If you decide that you can skip the next statement from the master, issue the following statements:

      mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER = N;
      mysql> START SLAVE;
      

      The value of N should be 1 if the next statement from the master does not use AUTO_INCREMENT or LAST_INSERT_ID(). Otherwise, the value should be 2. The reason for using a value of 2 for statements that use AUTO_INCREMENT or LAST_INSERT_ID() is that they take two events in the binary log of the master.

    4. If you are sure that the slave started out perfectly synchronized with the master, and that no one has updated the tables involved outside of the slave thread, then presumably the discrepancy is the result of a bug. If you are running the most recent version of MySQL, please report the problem. If you are running an older version, try upgrading to the latest production release to determine whether the problem persists.

15.3.6. How to Report Replication Bugs or Problems

When you have determined that there is no user error involved, and replication still either does not work at all or is unstable, it is time to send us a bug report. We need to obtain as much information as possible from you to be able to track down the bug. Please spend some time and effort in preparing a good bug report.

If you have a repeatable test case that demonstrates the bug, please enter it into our bugs database using the instructions given in Section 1.7, “How to Report Bugs or Problems”. If you have a “phantom” problem (one that you cannot duplicate at will), use the following procedure:

  1. Verify that no user error is involved. For example, if you update the slave outside of the slave thread, the data goes out of synchrony, and you can have unique key violations on updates. In this case, the slave thread stops and waits for you to clean up the tables manually to bring them into synchrony. This is not a replication problem. It is a problem of outside interference causing replication to fail.

  2. Run the slave with the --log-slave-updates and --log-bin options. These options cause the slave to log the updates that it receives from the master into its own binary logs.

  3. Save all evidence before resetting the replication state. If we have no information or only sketchy information, it becomes difficult or impossible for us to track down the problem. The evidence you should collect is:

    • All binary logs from the master

    • All binary logs from the slave

    • The output of SHOW MASTER STATUS from the master at the time you discovered the problem

    • The output of SHOW SLAVE STATUS from the slave at the time you discovered the problem

    • Error logs from the master and the slave

  4. Use mysqlbinlog to examine the binary logs. The following should be helpful to find the problem statement. log_pos and log_file are the Master_Log_File and Read_Master_Log_Pos values from SHOW SLAVE STATUS.

    shell> mysqlbinlog -j log_pos log_file | head
    

After you have collected the evidence for the problem, try to isolate it as a separate test case first. Then enter the problem with as much information as possible into our bugs database using the instructions at Section 1.7, “How to Report Bugs or Problems”.

15.4. Replication Implementation Overview

MySQL replication is based on the master server keeping track of all changes to your databases (updates, deletes, and so on) in its binary logs. Therefore, to use replication, you must enable binary logging on the master server. See Section 5.11.3, “The Binary Log”.

Each slave server receives from the master the saved updates that the master has recorded in its binary log, so that the slave can execute the same updates on its copy of the data.

It is extremely important to realize that the binary log is simply a record starting from the fixed point in time at which you enable binary logging. Any slaves that you set up need copies of the databases on your master as they existed at the moment you enabled binary logging on the master. If you start your slaves with databases that are not in the same state as those on the master when the binary log was started, your slaves are quite likely to fail.

After the slave has been set up with a copy of the master's data, it connects to the master and waits for updates to process. If the master fails, or the slave loses connectivity with your master, the slave keeps trying to connect periodically until it is able to resume listening for updates. The --master-connect-retry option controls the retry interval. The default is 60 seconds.

Each slave keeps track of where it left off when it last read from its master server. The master has no knowledge of how many slaves it has or which ones are up to date at any given time.

15.4.1. Replication Implementation Details

MySQL replication capabilities are implemented using three threads (one on the master server and two on the slave). When a START SLAVE statement is issued on a slave server, the slave creates an I/O thread, which connects to the master and asks it to send the updates recorded in its binary logs. The master creates a thread to send the binary log contents to the slave. This thread can be identified as the Binlog Dump thread in the output of SHOW PROCESSLIST on the master. The slave I/O thread reads the updates that the master Binlog Dump thread sends and copies them to local files, known as relay logs, in the slave's data directory. The third thread is the SQL thread, which the slave creates to read the relay logs and to execute the updates they contain.

MySQL Enterprise For constant monitoring of the status of slaves subscribe to the MySQL Enterprise Monitor. For more information see http://www.mysql.com/products/enterprise/advisors.html.

In the preceding description, there are three threads per master/slave connection. A master that has multiple slaves creates one thread for each currently-connected slave, and each slave has its own I/O and SQL threads.

The slave uses two threads so that reading updates from the master and executing them can be separated into two independent tasks. Thus, the task of reading statements is not slowed down if statement execution is slow. For example, if the slave server has not been running for a while, its I/O thread can quickly fetch all the binary log contents from the master when the slave starts, even if the SQL thread lags far behind. If the slave stops before the SQL thread has executed all the fetched statements, the I/O thread has at least fetched everything so that a safe copy of the statements is stored locally in the slave's relay logs, ready for execution the next time that the slave starts. This enables the master server to purge its binary logs sooner because it no longer needs to wait for the slave to fetch their contents.

The SHOW PROCESSLIST statement provides information that tells you what is happening on the master and on the slave regarding replication. See Section 6.5.5, “Examining Thread Information”, for descriptions of all replicated-related states.

The following example illustrates how the three threads show up in the output from SHOW PROCESSLIST.

On the master server, the output from SHOW PROCESSLIST looks like this:

mysql> SHOW PROCESSLIST\G
*************************** 1. row ***************************
     Id: 2
   User: root
   Host: localhost:32931
     db: NULL
Command: Binlog Dump
   Time: 94
  State: Has sent all binlog to slave; waiting for binlog to
         be updated
   Info: NULL

Here, thread 2 is a Binlog Dump replication thread for a connected slave. The State information indicates that all outstanding updates have been sent to the slave and that the master is waiting for more updates to occur. If you see no Binlog Dump threads on a master server, this means that replication is not running — that is, that no slaves are currently connected.

On the slave server, the output from SHOW PROCESSLIST looks like this:

mysql> SHOW PROCESSLIST\G
*************************** 1. row ***************************
     Id: 10
   User: system user
   Host:
     db: NULL
Command: Connect
   Time: 11
  State: Waiting for master to send event
   Info: NULL
*************************** 2. row ***************************
     Id: 11
   User: system user
   Host:
     db: NULL
Command: Connect
   Time: 11
  State: Has read all relay log; waiting for the slave I/O
         thread to update it
   Info: NULL

This information indicates that thread 10 is the I/O thread that is communicating with the master server, and thread 11 is the SQL thread that is processing the updates stored in the relay logs. At the time that the SHOW PROCESSLIST was run, both threads were idle, waiting for further updates.

The value in the Time column can show how late the slave is compared to the master. See Section 15.3.4, “Replication FAQ”.

15.4.2. Replication Relay and Status Files

By default, relay logs filenames have the form host_name-relay-bin.nnnnnn, where host_name is the name of the slave server host and nnnnnn is a sequence number. Successive relay log files are created using successive sequence numbers, beginning with 000001. The slave uses an index file to track the relay log files currently in use. The default relay log index filename is host_name-relay-bin.index. By default, the slave server creates relay log files in its data directory. The default filenames can be overridden with the --relay-log and --relay-log-index server options. See Section 15.1.2, “Replication Startup Options and Variables”.

Relay logs have the same format as binary logs and can be read using mysqlbinlog. The SQL thread automatically deletes each relay log file as soon as it has executed all events in the file and no longer needs it. There is no explicit mechanism for deleting relay logs because the SQL thread takes care of doing so. However, FLUSH LOGS rotates relay logs, which influences when the SQL thread deletes them.

A slave server creates a new relay log file under the following conditions:

  • Each time the I/O thread starts.

  • When the logs are flushed; for example, with FLUSH LOGS or mysqladmin flush-logs.

  • When the size of the current relay log file becomes too large. The meaning of “too large” is determined as follows:

    • If the value of max_relay_log_size is greater than 0, that is the maximum relay log file size.

    • If the value of max_relay_log_size is 0, max_binlog_size determines the maximum relay log file size.

A slave replication server creates two additional small files in the data directory. These status files are named master.info and relay-log.info by default. Their names can be changed by using the --master-info-file and --relay-log-info-file options. See Section 15.1.2, “Replication Startup Options and Variables”.

The two status files contain information like that shown in the output of the SHOW SLAVE STATUS statement, which is discussed in Section 12.6.2, “SQL Statements for Controlling Slave Servers”. Because the status files are stored on disk, they survive a slave server's shutdown. The next time the slave starts up, it reads the two files to determine how far it has proceeded in reading binary logs from the master and in processing its own relay logs.

The I/O thread updates the master.info file. The following table shows the correspondence between the lines in the file and the columns displayed by SHOW SLAVE STATUS.

LineDescription
1Number of lines in the file
2Master_Log_File
3Read_Master_Log_Pos
4Master_Host
5Master_User
6Password (not shown by SHOW SLAVE STATUS)
7Master_Port
8Connect_Retry
9Master_SSL_Allowed
10Master_SSL_CA_File
11Master_SSL_CA_Path
12Master_SSL_Cert
13Master_SSL_Cipher
14Master_SSL_Key

The SQL thread updates the relay-log.info file. The following table shows the correspondence between the lines in the file and the columns displayed by SHOW SLAVE STATUS.

LineDescription
1Relay_Log_File
2Relay_Log_Pos
3Relay_Master_Log_File
4Exec_Master_Log_Pos

The contents of the relay-log.info file and the states shown by the SHOW SLAVE STATES command may not match if the relay-log.info file has not been flushed to disk. Ideally, you should only view relay-log.info on a slave that is offline (i.e. mysqld is not running). For a running system, SHOW SLAVE STATUS should be used.

When you back up the slave's data, you should back up these two status files as well, along with the relay log files. They are needed to resume replication after you restore the slave's data. If you lose the relay logs but still have the relay-log.info file, you can check it to determine how far the SQL thread has executed in the master binary logs. Then you can use CHANGE MASTER TO with the MASTER_LOG_FILE and MASTER_LOG_POS options to tell the slave to re-read the binary logs from that point. Of course, this requires that the binary logs still exist on the master server.

If your slave is subject to replicating LOAD DATA INFILE statements, you should also back up any SQL_LOAD-* files that exist in the directory that the slave uses for this purpose. The slave needs these files to resume replication of any interrupted LOAD DATA INFILE operations. The directory location is specified using the --slave-load-tmpdir option. If this option is not specified, the directory location is the value of the tmpdir system variable.

15.4.3. How Servers Evaluate Replication Rules

If a master server does not write a statement to its binary log, the statement is not replicated. If the server does log the statement, the statement is sent to all slaves and each slave determines whether to execute it or ignore it.

On the master side, decisions about which statements to log are based on the --binlog-do-db and --binlog-ignore-db options that control binary logging. For a description of the rules that servers use in evaluating these options, see Section 5.11.3, “The Binary Log”.

On the slave side, decisions about whether to execute or ignore statements received from the master are made according to the --replicate-* options that the slave was started with. (See Section 15.1.2, “Replication Startup Options and Variables”.) The slave evaluates these options using the following procedure, which first checks the database-level options and then the table-level options.

In the simplest case, when there are no --replicate-* options, the procedure yields the result that the slave executes all statements that it receives from the master. Otherwise, the result depends on the particular options given. In general, to make it easier to determine what effect an option set will have, it is recommended that you avoid mixing “do” and “ignore” options, or wildcard and non-wildcard options.

Stage 1. Check the database options.

At this stage, the slave checks whether there are any --replicate-do-db or --replicate-ignore-db options that specify database-specific conditions:

  • No: Permit the statement and proceed to the table-checking stage.

  • Yes: Test the options using the same rules as for the --binlog-do-db and --binlog-ignore-db options to determine whether to permit or ignore the statement. What is the result of the test?

    • Permit: Do not execute the statement immediately. Defer the decision and proceed to the table-checking stage.

    • Ignore: Ignore the statement and exit.

This stage can permit a statement for further option-checking, or cause it to be ignored. However, statements that are permitted at this stage are not actually executed yet. Instead, they pass to the following stage that checks the table options.

Stage 2. Check the table options.

First, as a preliminary condition, the slave checks whether the statement occurs within a stored function or (prior to MySQL 5.0.12) a stored procedure. If so, execute the statement and exit. (Stored procedures are exempt from this test as of MySQL 5.0.12 because procedure logging occurs at the level of statements that are executed within the routine rather than at the CALL level.)

Next, the slave checks for table options and evaluates them. If the server reaches this point, it executes all statements if there are no table options. If there are “do” table options, the statement must match one of them if it is to be executed; otherwise, it is ignored. If there are any “ignore” options, all statements are executed except those that match any ignore option. The following steps describe how this evaluation occurs in more detail.

  1. Are there any --replicate-*-table options?

    • No: There are no table restrictions, so all statements match. Execute the statement and exit.

    • Yes: There are table restrictions. Evaluate the tables to be updated against them. There might be multiple tables to update, so loop through the following steps for each table looking for a matching option (first the non-wild options, and then the wild options). Only tables that are to be updated are compared to the options. For example, if the statement is INSERT INTO sales SELECT * FROM prices, only sales is compared to the options). If several tables are to be updated (multiple-table statement), the first table that matches “do” or “ignore” wins. That is, the server checks the first table against the options. If no decision could be made, it checks the second table against the options, and so on.

  2. Are there any --replicate-do-table options?

    • No: Proceed to the next step.

    • Yes: Does the table match any of them?

      • No: Proceed to the next step.

      • Yes: Execute the statement and exit.

  3. Are there any --replicate-ignore-table options?

    • No: Proceed to the next step.

    • Yes: Does the table match any of them?

      • No: Proceed to the next step.

      • Yes: Ignore the statement and exit.

  4. Are there any --replicate-wild-do-table options?

    • No: Proceed to the next step.

    • Yes: Does the table match any of them?

      • No: Proceed to the next step.

      • Yes: Execute the statement and exit.

  5. Are there any --replicate-wild-ignore-table options?

    • No: Proceed to the next step.

    • Yes: Does the table match any of them?

      • No: Proceed to the next step.

      • Yes: Ignore the statement and exit.

  6. No --replicate-*-table option was matched. Is there another table to test against these options?

    • No: We have now tested all tables to be updated and could not match any option. Are there --replicate-do-table or --replicate-wild-do-table options?

      • No: There were no “do” table options, so no explicit “do” match is required. Execute the statement and exit.

      • Yes: There were “do” table options, so the statement is executed only with an explicit match to one of them. Ignore the statement and exit.

    • Yes: Loop.

Examples:

  • No --replicate-* options at all

    The slave executes all statements that it receives from the master.

  • --replicate-*-db options, but no table options

    The slave permits or ignores statements using the database options. Then it executes all statements permitted by those options because there are no table restrictions.

  • --replicate-*-table options, but no database options

    All statements are permitted at the database-checking stage because there are no database conditions. The slave executes or ignores statements based on the table options.

  • A mix of database and table options

    The slave permits or ignores statements using the database options. Then it evaluates all statements permitted by those options according to the table options. In some cases, this process can yield what might seem a counterintuitive result. Consider the following set of options:

    [mysqld]
    replicate-do-db    = db1
    replicate-do-table = db2.mytbl2
    

    Suppose that db1 is the default database and the slave receives this statement:

    INSERT INTO mytbl1 VALUES(1,2,3);
    

    The database is db1, which matches the --replicate-do-db option at the database-checking stage. The algorithm then proceeds to the table-checking stage. If there were no table options, the statement would be executed. However, because the options include a “do” table option, the statement must match if it is to be executed. The statement does not match, so it is ignored. (The same would happen for any table in db1.)