Database Indexing Development Lifecycle...Say What?

By: Gregory A. Larsen
To read more DBA articles, visit http://dba.fyicenter.com/article/

This is the first article of a series where I’ll be exploring Indexing Guidelines. Having appropriate indexes on your database is critical in making sure your application is retrieving records quickly. However, having too many indexes can also slow down your application. In this series of articles, I hope to help you understand how to determine what columns you should consider for indexes, and how those columns should be used in indexes to optimize performance. In this first article, I will be discussing the indexing development lifecycle.

What Normally Happens

It has been my experience that designing indexes for a database always seems to be an afterthought. Now I don’t mean all indexes. We all know that when people are putting together their database diagram, at a minimum they do identify the primary keys and the foreign key relationships. But that doesn’t mean they build any physical indexes from these diagrams.

As developers are building their application of course, they add indexes, but they seem to add them when they need them. If they should identify some query is running slow they will look at the execution plan to determine what is happening then they will build an index to speed up the query.

Even after they place their application into production, they might add an index or two. But then again they do this normally only after someone complains about slow performance.

Is this the best approach for developing an efficient indexing strategy for a database? I’d like to say in most cases probably not. Developers and DBAs need to be more proactive when it comes to developing indexes. Moreover, the index development process should not stop once an application gets into production. Developers and DBAs should continue to tweak the indexes long after an application goes into production. I’d like to suggest that managing indexes for a database should go through a development cycle just like you do with your code.

Index Development Lifecycle
The notion of building indexes and forgetting about them should not be used as philosophy when thinking about database indexes. Indexes need to be well thought out and tweaked over time. You need to develop an indexing development lifecycle to build and manage your indexes appropriately over time. I’ll give you some ideas that you can use to establish an indexing development lifecycle for your environment.

In most IT shops, there are at least three different environments: Production, Quality Assurance/Test, and Development. Your T-SQL code migrates through these different environments as your code progresses from one development phase to another, and so should your indexes. Therefore, why not have the following lifecycle phases for developing and maintaining indexes: Design, Development, Acceptance Testing, Production, and Maintenance. Let me go through each one of these phases and discuss the kinds of tasks you should consider performing in each phase.

Design Phase:
The design phase for indexes is just like the design phase for developing code. In this phase, you should look at the data model of your new database and consider the processing requirements your application will need to go through to meet the business rules you have defined. Your programs will need to read data a particular way to build reports. Alternatively, you will have an online screen that will allow users to enter some search criteria, so different screens can be displayed. Play attention to these different data access requirements of the application. If a report needs data to come back in sorted order based on a column value then this column would be a good candidate for being in an index. If customers need to enter an ID and date range to return some customer records on a screen, then the ID and date column associated with the date range would be additional candidate columns for indexes.

In the design phase, you need to get a sense for which columns are being used, and what order those columns will be returned to the application. You can then use this information to design some best guesses at what indexes your application will need. By doing this data analysis you will have the information you need to start identifying some indexes that will most likely be useful for your application.

Development Phase:
In the development phase, you will review how well those best guess indexes are meeting the needs of your application. Just like any other development phase, this is the phase where you will be tweaking those indexes in your design when you realize they are not meeting your application data access requirements.

As you find your code is performing poorly you will add more indexes. Keep in mind indexes don’t come for free. The more indexes you have the more costly an INSERT, UPDATE and DELETE statement will be if it has to update a bunch of indexes. Therefore, create indexes for those queries in your application that are going to be run frequently. If you also have monthly or yearly queries that take a long time, possibly you can do without an index. Possibly you can create these indexes once a month, or year for these monthly/yearly processes. These are the kinds of decisions you need to make when you are developing indexes. You need to have well balanced indexes so your application performs acceptable most of the time for the frequently run queries. You need to make sure you do not have too many indexes that cause the INSERT, UPDATE, and DELETE statements to take a long time.

Once you are done with developing your application code, but prior to moving into user acceptance testing, you need to go back and review the indexes you have. Make sure all the indexes you created during your development phase are the ones that really made your queries run faster and they are not ones you created that provided little value. If you removed those useless indexes as you went along then you probably can omit this step. Another thing to do is verify that you don’t have any duplicate indexes. If you have duplicate indexes you are just wasting resources, both disk space and the cost of maintaining them over time. Here is a script, written by Paul Neilson, that can be used to identify those duplicates:

Full article...