DBA > Job Interview Questions > MySQL and SQL

MySQL - Replication Features and known problems

More DBA job interview questions and answers at http://dba.fyicenter.com/Interview-Questions/

(Continued from previous question...)

MySQL - Replication Features and known problems

Below is an explanation of what is supported and what is not:

Replication will be done correctly with AUTO_INCREMENT, LAST_INSERT_ID, and TIMESTAMP values.
RAND() in updates does not replicate properly. Use RAND(some_non_rand_expr) if you are replcating updates with RAND(). You can, for example, use UNIX_TIMESTAMP() for the argument to RAND().
LOAD DATA INFILE will be handled properly as long as the file still resides on the master server at the time of update propagation. LOAD LOCAL DATA INFILE will be skipped.
Update queries that use user variables are not replication-safe (yet).
Temporary tables starting in 3.23.29 are replicated properly with the exception of the case when you shut down slave server ( not just slave thread), you have some temporary tables open, and the are used in subsequent updates. To deal with this problem, to shut down the slave, do SLAVE STOP, then check Slave_open_temp_tables variable to see if it is 0, then issue mysqladmin shutdown. If the number is not 0, restart the slave thread with SLAVE START and see if you have better luck next time. There will be a cleaner solution, but it has to wait until version 4.0. In earlier versions temporary tables are not being replicated properly - we recommend that you either upgrade, or execute SET SQL_LOG_BIN=0 on your clients before all queries with temp tables.
MySQL only supports one master and many slaves. We will in 4.x add a voting algorithm to automaticly change master if something goes wrong with the current master. We will also introduce 'agent' processes to help doing load balancing by sending select queries to different slaves.
Starting in Version 3.23.26, it is safe to connect servers in a circular master-slave relationship with log-slave-updates enabled. Note, however, that many queries will not work right in this kind of setup unless your client code is written to take care of the potential problems that can happen from updates that occur in different sequence on different servers. Note that the log format has changed in Version 3.23.26 so that pre-3.23.26 slaves will not be able to read it.
If the query on the slave gets an error, the slave thread will terminate, and a message will appear in the .err file. You should then connect to the slave manually, fix the cause of the error (for example, non-existent table), and then run SLAVE START sql command (available starting in Version 3.23.16). In Version 3.23.15, you will have to restart the server.
If connection to the master is lost, the slave will retry immediately, and then in case of failure every master-connect-retry (default 60) seconds. Because of this, it is safe to shut down the master, and then restart it after a while. The slave will also be able to deal with network connectivity outages.
Shutting down the slave (cleanly) is also safe, as it keeps track of where it left off. Unclean shutdowns might produce problems, especially if disk cache was not synced before the system died. Your system fault tolerance will be greatly increased if you have a good UPS.
If the master is listening on a non-standard port, you will also need to specify this with master-port parameter in my.cnf . In Version 3.23.15, all of the tables and databases will be replicated. Starting in Version 3.23.16, you can restrict replication to a set of databases with replicate-do-db directives in my.cnf or just exclude a set of databases with replicate-ignore-db. Note that up until Version 3.23.23, there was a bug that did not properly deal with LOAD DATA INFILE if you did it in a database that was excluded from replication.
Starting in Version 3.23.16, SET SQL_LOG_BIN = 0 will turn off replication (binary) logging on the master, and SET SQL_LOG_BIN = 1 will turn in back on - you must have the process privilege to do this.
Starting in Version 3.23.19, you can clean up stale replication leftovers when something goes wrong and you want a clean start with FLUSH MASTER and FLUSH SLAVE commands. In Version 3.23.26 we have renamed them to RESET MASTER and RESET SLAVE respectively to clarify what they do. The old FLUSH variants still work, though, for compatibility.
Starting in Version 3.23.21, you can use LOAD TABLE FROM MASTER for network backup and to set up replication initially. We have recently received a number of bug reports concerning it that we are investigating, so we recommend that you use it only in testing until we make it more stable.
Starting in Version 3.23.23, you can change masters and adjust log position with CHANGE MASTER TO.
Starting in Version 3.23.23, you tell the master that updates in certain databases should not be logged to the binary log with binlog-ignore-db.
Starting in Version 3.23.26, you can use replicate-rewrite-db to tell the slave to apply updates from one database on the master to the one with a different name on the slave.
Starting in Version 3.23.28, you can use PURGE MASTER LOGS TO 'log-name' to get rid of old logs while the slave is running. 11.5 Replication Options in my.cnf
If you are using replication, we recommend you to use MySQL Version 3.23.30 or later. Older versions work, but they do have some bugs and are missing some features.

On both master and slave you need to use the server-id option. This sets an unique replication id. You should pick a unique value in the range between 1 to 2^32-1 for each master and slave. Example: server-id=3

The following table has the options you can use for the MASTER:

Option Description
log-bin=filename Write to a binary update log to the specified location. Note that if you give it a parameter with an extension (for example, log-bin=/mysql/logs/replication.log ) versions up to 3.23.24 will not work right during replication if you do FLUSH LOGS . The problem is fixed in Version 3.23.25. If you are using this kind of log name, FLUSH LOGS will be ignored on binlog. To clear the log, run FLUSH MASTER, and do not forget to run FLUSH SLAVE on all slaves. In Version 3.23.26 and in later versions you should use RESET MASTER and RESET SLAVE
log-bin-index=filename Because the user could issue the FLUSH LOGS command, we need to know which log is currently active and which ones have been rotated out and in what sequence. This information is stored in the binary log index file. The default is `hostname`.index. You can use this option if you want to be a rebel. (Example: log-bin-index=db.index) sql-bin-update-same If set, setting SQL_LOG_BIN to a value will automatically set SQL_LOG_UPDATE to the same value and vice versa.
binlog-do-db=database_name Tells the master it should log updates for the specified database, and exclude all others not explicitly mentioned. (Example: binlog-do-db=some_database)
binlog-ignore-db=database_name Tells the master that updates to the given database should not be logged to the binary log (Example: binlog-ignore-db=some_database)

The following table has the options you can use for the SLAVE:

Option Description
master-host=host Master hostname or IP address for replication. If not set, the slave thread will not be started. (Example: master-host=db-master.mycompany.com)
master-user=username The user the slave thread will us for authentication when connecting to the master. The user must have FILE privilege. If the master user is not set, user test is assumed. (Example: master-user=scott)
master-password=password The password the slave thread will authenticate with when connecting to the master. If not set, an empty password is assumed. (Example: master-password=tiger)
master-port=portnumber The port the master is listening on. If not set, the compiled setting of MYSQL_PORT is assumed. If you have not tinkered with configure options, this should be 3306. (Example: master-port=3306)
master-connect-retry=seconds The number of seconds the slave thread will sleep before retrying to connect to the master in case the master goes down or the connection is lost. Default is 60. (Example: master-connect-retry=60)
master-info-file=filename The location of the file that remembers where we left off on the master during the replication process. The default is master.info in the data directory. Sasha: The only reason I see for ever changing the default is the desire to be rebelious. (Example: master-info-file=master.info)
replicate-do-table=db_name.table_name Tells the slave thread to restrict replication to the specified database. To specify more than one table, use the directive multiple times, once for each table. . (Example:
replicate-do-table=some_db.some_table)
replicate-ignore-table=db_name.table_name Tells the slave thread to not replicate to the specified table. To specify more than one table to ignore, use the directive multiple times, once for each table.(Example:
replicate-ignore-table=db_name.some_table)
replicate-wild-do-table=db_name.table_name Tells the slave thread to restrict replication to the tables that match the specified wildcard pattern. . To specify more than one table, use the directive multiple times, once for each table. . (Example: replicate-do-table=foo%.bar% will replicate only updates to tables in all databases that start with foo and whose table names start with bar)
replicate-wild-ignore-table=db_name.table_name Tells the slave thread to not replicate to the tables that match the given wild card pattern. To specify more than one table to ignore, use the directive multiple times, once for each table.(Example: replicate-ignore-table=foo%.bar% - will not upates to tables in all databases that start with foo and whose table names start with bar)
replicate-ignore-db=database_name Tells the slave thread to not replicate to the specified database. To specify more than one database to ignore, use the directive multiple times, once for each database. This option will not work if you use cross database updates. If you need cross database updates to work, make sure you have 3.23.28 or later, and use replicate-wild-ignore-table=db_name.%(Example: replicate-ignore-db=some_db)
replicate-do-db=database_name Tells the slave thread to restrict replication to the specified database. To specify more than one database, use the directive multiple times, once for each database. Note that this will only work if you do not use cross-database queries such as UPDATE some_db.some_table SET foo='bar' while having selected a different or no database. If you need cross database updates to work, make sure you have 3.23.28 or later, and use replicate-wild-do-table=db_name.% (Example: replicate-do-db=some_db)
log-slave-updates Tells the slave to log the updates from the slave thread to the binary log. Off by default. You will need to turn it on if you plan to daisy-chain the slaves.
replicate-rewrite-db=from_name->to_name Updates to a database with a different name than the original (Example: replicate-rewrite-db=master_db_name->slave_db_name
skip-slave-start Tells the slave server not to start the slave on the startup. The user can start it later with SLAVE START.

(Continued on next question...)

Other Job Interview Questions