Much of the material that I’ll discuss in this article comes from the excellent work done by Scott W Ambler and Pramodkumar J. Sadalage who wrote the book ‘Refactoring Databases: Evolutionary Database Design’ and maintain the site http://www.agiledata.org/. Just as Martin Fowler’s original book ‘Refactoring: Improving the Design of Existing Code’ introduced the world to the methodology of refactoring code, their book, and website, introduces us to the concepts and disciplines needed to refactor databases.
My aim is to draw attention to their work and shine some light on related topics that are specific to the SQL that we developers use to interact with the database.
Most developers have accepted, and even embraced, the need for a discipline of refactoring application code.By contrast, Refactoring a database is still regarded as an alien concept. I’ll be arguing in this article that refactoring the database is just as important as refactoring code.
The task of refactoring an application, whether it is the database or the application code, is undertaken with the objectives of
* Improving maintainability
* making it more understandable
* Making it easier to make changes
* Making it easier to add new features
* learning the system
Some Background Information
Not all applications are created equal. For some of them, the task of refactoring the database is only marginally more complicated than that of refactoring code. For others, it can be an arduous task that could take more than a year to complete.
The following diagrams illustrate the two extremes.
Some Applications may have a single database and a single application using that database.Other applications may include multiple linked databases, with associated systems for reporting and ETL, serving several applications, not all of which may even be known to the developers. Changes to your database must be coordinated across all of these various systems.
If you work with databases in the complex extreme, you may be tempted to use this complexity as a reason not to refractor the database. It is particularly daunting to face the risk that there may be applications that you don’t know about. This is a true risk, but it is also a symptom that warns that refactoring is probably long overdue. Not knowing what dependent systems may be affected if you make a particular change is not a reason to avoid the change.It is compelling opportunity to learn your system better and better understand and record these dependencies.
I have often found myself in this situation. Many different things may cause it: Key people may have left, taking critical knowledge with them. You may have just discovered a “departmental” solution that has been elevated to the enterprise; you may be starting a new job or working on a new project.
The Basic Process: An Overview
The basic process when you have only a single application and its database is straightforward, really no different from refactoring your code. Test before you make the change. Make a small change.Test after each change. Rerun all tests at each stage.
When there is more than one application involved, things get more complicated. Each affected application may not necessarily be updated on the same release schedule. The system will need to work with some applications using the old structure and some applications using the new structures. In both cases, the end result should be the same from the application perspective.
Part of the refactoring process is ensuring that all applications work properly during the transition period and that all of the structures that are added in order to to support this transition period are subsequently removed once all applications are using only the new structures.