As DBAs, there are many things we need to worry about. In my last article, Planning for Disaster, I covered the importance of a pessimistic mindset when devising a disaster recovery plan. In this one, I'll tackle perhaps the biggest headache for a DBA: a slow and unreliable storage system.
I’ll cover a number of tactics and methods employed by experienced DBAs to ensure that their storage systems perform adequately, and are capable of handling peak load requirements. The next article will look at storage reliability and integrity.
Performance-Centric vs. Capacity-Centric Designs
It’s very common to hear customers complain about the quoted cost and delivery time of enterprise storage. How much!? I can walk down to Best Buy and purchase a Terabyte drive for $100 and have it ready in 30 minutes! Of course, it’s more complicated than that. As DBAs we need to consider backups, performance, manageability and numerous other factors, mostly absent in the consumer space. That being said, it still surprises me how often a capacity-centric storage design approach is used by DBAs today.
Consider a request for a new 500GB database. How many hard disks do we need to fulfil that request? A capacity-centric approach would use a single 640GB disk. Nice and easy, except we have no idea how the system will perform, let alone issues surrounding backups and fault tolerance. Using a performance-centric approach, we’ll look at this in a completely different manner. Before we even consider the size requirements, we look at the throughput requirements and calculate the number of disks using simple division based on the throughput of each disk. More on that shortly...
A fundamental performance tuning task is to identify the biggest bottleneck. Once located, the bottleneck can be removed, or at least alleviated, in order to increase throughput to an acceptable level. While it’s true that the biggest performance gains can often be realized through good application design, running on sub-optimal hardware makes the performance tuning process much more difficult. Hard disks, as the slowest hardware component, deserve special consideration when optimizing the hardware platform on which SQL Server runs.
Hard Disks and Seek Time
Conventional hard disks are mechanical components; that is, they’re comprised of moving parts. Circular platters spin and disk arms move in and out from the centre of the platters, thus enabling all parts of the disk to be read. The time it takes to physically move the disk heads (on the end of the disk arm) to the correct position is known as the seek time. The seek time in combination with other measurements such as rotational time and transfer time, determines the overall speed of the drive.
The seek time for any given disk is determined by how far away from the required data the disk heads are at the time of the read/write request. In typical OnLine Transaction Processing (OLTP) databases, the read/write activity is largely random; each transaction will likely require data from a different part of the disk. It follows that in most OLTP applications, hard disks will spend most of their time seeking data, therefore, the seek time of the disk is a crucial bottleneck.
There are limits to how fast a disk can spin and how fast disk heads can move and, arguably, we reached that limit many years ago. What should be obvious by now is that in order to overcome these physical limits, we need to use multiple disks, therefore reducing the effects of seek/spin time. Enter RAID. In addition to providing protection against data loss through the redundant storage of data on multiple disks, RAID partitions stripe data across many disks. This increases performance by avoiding the bottlenecks of a single disk. The next question is how many disks do we need?
How many Disks?
In answering this question we need to consider two important measurements. Firstly, the number of reads and writes we expect our database to perform per second, and secondly, the IO per second (IOPS) rating of each disk. Expressed as a formula, we have:
Required # Disks = (Reads/sec + (Writes/sec * RAID adjuster)) / Disk IOPS
You’ll note that we’ve multiplied writes/sec by “RAID Adjuster”. RAID disk systems write the same piece of data multiple times in order to provide fault tolerance against disk failure. The RAID level chosen determines the multiplier factor. For example, a RAID 1 (or 10) array duplicates each write and so the RAID multiplier is 2. RAID 5, discussed later in this article, has a higher write overhead in maintaining parity across the disks, and therefore has a higher multiplier value.
The next question is: how many reads and writes per second does the database perform? There are a few ways to answer that question. For existing production systems, we can measure it using the Windows Performance Monitor tool. For systems in development, we can measure the reads/writes per transaction, and then multiply this value by the peak number of transactions per second that we expect the system to endure. I emphasize peak transactions, because it’s important to plan for maximum expected load.
Say, for example, we’re designing a RAID 10 system. The individual disks are capable of 150 IOPS and the system as a whole is expected to handle 1200 reads per second and 400 writes per second. We can calculate the number of disks required as follows:
Required # Disks = (1200 + (400 * 2)) / 150 = ~13 DISKS
What about capacity? We haven’t even considered that yet, and deliberately so. Assuming we’re using 72GB disks, 13 of them will give us a capacity of about 936GB, or just over 450GB usable space after allowing for duplicate writes under RAID. If our estimated database size is only 100GB, then we have ~ 22% utilization to ensure we can meet peak throughput demand. In summary, this is the performance-centric approach.
Determining the number of disks required is only one (albeit very important) aspect of designing a storage system. The other crucial design point is ensuring that the I/O bus is capable of handling the IO throughput. The more disks we have, the higher the potential throughput requirements and we need to be careful to ensure we don’t saturate the IO bus.