More DBA job interview questions and answers at
http://dba.fyicenter.com/Interview-Questions/
(Continued from previous question...)
MySQL - Estimating Query Performance
In most cases you can estimate the performance by counting disk seeks. For small tables, you can usually find the row in 1
disk seek (as the index is probably cached). For bigger tables, you can estimate that (using B++ tree indexes) you will
need: log(row_count) / log(index_block_length / 3 * 2 / (index_length + data_pointer_length)) + 1 seeks to find a row.
In MySQL an index block is usually 1024 bytes and the data pointer is usually 4 bytes. A 500,000 row table with an index
length of 3 (medium integer) gives you: log(500,000)/log(1024/3*2/(3+4)) + 1 = 4 seeks.
As the above index would require about 500,000 * 7 * 3/2 = 5.2M, (assuming that the index buffers are filled to 2/3, which
is typical) you will probably have much of the index in memory and you will probably only need 1-2 calls to read data from
the OS to find the row.
For writes, however, you will need 4 seek requests (as above) to find where to place the new index and normally 2 seeks to
update the index and write the row.
Note that the above doesn't mean that your application will slowly degenerate by N log N! As long as everything is cached by
the OS or SQL server things will only go marginally slower while the table gets bigger. After the data gets too big to be
cached, things will start to go much slower until your applications is only bound by disk-seeks (which increase by N log N).
To avoid this, increase the index cache as the data grows.
(Continued on next question...)