Failure #1: Failing to plan
Not so long ago, planning seemed an obvious precursor to development. Database teams would no more build databases without careful planning than structural architects would erect buildings without well-considered blueprints. The belief was that a meticulous planning phase was the only way to ensure you didn't end up with a database hacked together with arbitrary components that failed to take into account data integrity and system performance. If you didn't invest the necessary time and effort in the beginning to understand the data, the constraints and the processes, you'd pay off big time with a database that would need to be refactored, replaced or scrapped altogether.
Although such planning wasn't always realized, it emerged as an ideal that many database people still adhere to today. That's not to say they can predict every design need in advance, but they still believe it's worth the effort to understand the data and how it will be used going forward. The last thing anyone wants is a database built for transaction processing when its intent is a data warehouse.
Times have changed, along with notions about how to go about database design. Agile methodologies have taken hold, forcing database teams to rethink their approaches to design and development.
In Agile, the Class-diagrams of Domain Modelling, derived from the Use-Cases, have usually replaced Entity-Relationship modelling; but the need for planning has not diminished. We still need to understand the data and what it's supposed to do and what are the best and safest ways to manage, store, and protect it.
In that respect, it's not the Agile methodologies that present many of the challenges faced by database teams, but rather having to work with individuals who do not understand the full nature of managing, storing, and protecting data. They see database development as comparable to application development, something that can be easily integrated within existing tools, systems, processes, schedules, and workflows, most of which were designed for application development. Although there are practices that database teams can benefit from, such as source control and continuous integration, database design and development are very different disciplines and need to be afforded the respect they deserve.
Ideally, the entire product team will appreciate the important role data plays and how it underpins all their other efforts. We live in the age of data. It drives our businesses and economies and is what keeps us employed. Our applications provide the means to display and manipulate data, but it is the data itself that gives those applications meaning. Yet some treat the database as merely a component of the application, without fully grasping the data's central role, compartmentalizing it in the same way they do dynamic libraries or web services—and careful database design often becomes part of the collateral damage.
Way out of Failure #1
The need for properly planning a database, whether done mostly upfront through a more traditional approach or in stages through the Agile approach, does not go away. In some cases, database teams are put into the position of having to evangelize the importance of proper planning and convincing those around them that taking a piece-meal approach to database design can result in a garbage database and unreliable data, or a lot of data migration planning to address data concerns that were ignored for short-term gains.
A product team, whether or not they've adopted Agile methodologies, must we willing to invest the resources necessary to properly understand the data and have a plan for managing, storing, and protecting it. The plan will change, of course, but those changes must also be as carefully designed and managed. The importance of the data is too great for it to be short-changed. The goal is to have a comprehensive vision of how to move forward that makes sense in relation to the data itself, whether the scope of that vision expands throughout the entire product lifecycle or is limited to the next few sprints.
Teams committed to Agile methodologies acknowledge the importance of modelling the domain, but they might find the planning process to be most challenging in the early stages. Data architects can certainly adjust their approach to design—perhaps not thinking on such grand scales initially—but domain knowledge of the data and how it will be used is still essential, and time and resources must be allocated to ensure that the product team has the understanding it needs in order to provide a proper starting point and move forward in a coherent fashion. If that can happen within the Agile structure, great. If it needs to happen before the first sprint kicks off, that's fine too. But it must happen.
Failure #2: Not bothering with documentation
If a database and the related data environment are not properly documented, it will be difficult to maintain or alter them as business requirements change.
I was tempted to make this the number one failure, not that the failures are necessarily listed in order of priority, but the task of properly documenting a database is so often glossed-over that even discussions around planning and design fail to mention it. I can understand that, though. In the strictest sense, documentation has nothing to do with the design itself, but it has everything thing to do with communicating that design and making it accessible in the future.
But let me take a step back. When I say "documentation," I'm not just talking about a modeling diagram. Sure, that's a good thing to have, but so are data dictionaries, extended properties, data flow diagrams, details about the data sources and how data is transformed, and information about any other component and process that make up a data-driven project. Even the database's object names play an important role (which we'll cover in the next failure).
Unfortunately, proper documentation is often treated as an afterthought, if addressed at all. When schedules and budgets grow tight, documentation gets tossed out the window. But make no mistake, this is technical debt at a very high rate of interest. Those corners being cut now will soon cost an organization over and over again as players try to track changes, identify problems, fix bugs, implement new systems, or in any way make sense of the existing logic and what's happening with the data.
Lack of proper documentation might be a result of shortsighted thinking or simply failing to understand the significant role it plays, but its role cannot be underestimated. It can serve as a valuable resource at ever stage of the product lifecycle for a wide range of individuals, whether database developers, application developers, testers, analysts, managers, DBAs, data architects, new hires, or any other participants who would otherwise spend hours trying to make sense of something that should have been documented in the first place.
Way out of Failure #2
Documentation can, and should, come in many forms. A data dictionary can be a good place to start. It provides descriptions of the tables, columns, relationships, constraints, and other database objects. The data dictionary should be complete and accurate and kept up-to-date as components change. Some teams might prefer to use extended properties, if their RDBMS supports them, or use both, just as long as developers, DBAs, architects and everyone else can know what's going on and have the same understanding of the purpose of each object, whether it's a foreign key or check constraint or even a table. Even details such as why a constraint can't be implemented can be useful.
Modeling diagrams can also be extremely helpful for conveying the database's design to various players. Such diagrams not only show the objects within a database, but also the relationships between them.
These days, data modeling tools make it relatively easy to generate data dictionaries and modeling diagrams and update them as needed, again providing an accurate picture of the database at any given point in time and helping to ensure that everyone has the same understanding of how things work. Perhaps the biggest challenge with this approach is getting the team to work together to provide a consistent way of generating the information. That, of course, will depend on the team.