More DBA job interview questions and answers at
http://dba.fyicenter.com/Interview-Questions/
(Continued from previous question...)
MySQL - When and how much can MySQL replication improve the performance of my system?
MySQL replication is most beneficial for a system with frequent reads and not so frequent writes. In theory, by using a
one master/many slaves setup you can scale by adding more slaves until you either run out of network bandwidth, or your
update load grows to the point that the master cannot handle it.
In order to determine how many slaves you can get before the added benefits begin to level out, and how much you can improve
performance of your site, you need to know your query patterns, and empirically (by benchmarking) determine the relationship
between the throughput on reads (reads per second, or max_reads) and on writes max_writes) on a typical master and a typical
slave. The example below will show you a rather simplified calculation of what you can get with replication for our imagined
system.
Let's say our system load consists of 10% writes and 90% reads, and we have determined that max_reads = 1200 - 2 *
max_writes, or in other words, our system can do 1200 reads per second with no writes, our average write is twice as slow as
average read, and the relationship is linear. Let us suppose that our master and slave are of the same capacity, and we have
N slaves and 1 master. Then we have for each server (master or slave):
reads = 1200 - 2 * writes (from bencmarks)
reads = 9* writes / (N + 1) (reads split, but writes go to all servers)
9*writes/(N+1) + 2 * writes = 1200
writes = 1200/(2 + 9/(N+1)
So if N = 0, which means we have no replication, our system can handle 1200/11, about 109 writes per second (which means we
will have 9 times as many reads due to the nature of our application).
If N = 1, we can get up to 184 writes per second.
If N = 8, we get up to 400.
If N = 17, 480 writes.
Eventually as N approaches infinity (and our budget negative infinity), we can get very close to 600 writes per second,
increasing system throughput about 5.5 times. However, with only 8 servers, we increased it almost 4 times already.
Note that our computations assumed infinite network bandwidth, and neglected several other factors that could turn out to be
signficant on your system. In many cases, you may not be able to make a computation similar to the one above that will
accurately predict what will happen on your system if you add N replication slaves. However, answering the following
questions should help you decided whether and how much, if at all, the replication will improve the performance of your
system:
What is the read/write ratio on your system?
How much more write load can one server handle if you reduce the reads?
How many slaves do you have bandwidth for on your network?
(Continued on next question...)