DB2 for z/OS provides mainframe users with unmatched levels of resilience and scalability through technology known as data sharing. IBM announced that similar capabilities would be delivered for DB2 for LUW, in an optional facility dubbed pureScale. Read on to learn whether DB2 pureScale has met or exceeded expectations.
Welcome to my last column for 2010. Rather than the traditional look back on the past year, this month I’d like to share some practical experiences with one of the technologies I’ve mentioned several times recently: pureScale.
I’ve covered the major pureScale concepts in previous columns, but here’s a quick refresher.
For many years, DB2 for z/OS has been able to provide mainframe users with unmatched levels of resilience and scalability courtesy of some rather neat technology known as data sharing. This makes use of IBM’s Parallel Sysplex technology to allow many DB2 subsystems (or “members”) to share the same data in a shared-disk architecture. In October 2009, IBM announced that similar capabilities would be delivered for the DB2 for Linux, Unix and Windows product, in an optional facility dubbed pureScale.
As shown in the diagram below, a component known as a CF (aka “coupling facility”, “clustering facility” or more properly “PowerHA pureScale server”) handles the difficult job of coordinating the updates made by each member to ensure data integrity is maintained. Each member has direct access to the CF via an InfiniBand high-speed network interconnect, minimizing the performance overhead and providing excellent scalability. (IBM has measured near-linear scalability right up to the architectural limit of 128 members.)
One of the design goals for pureScale was to minimize the impact on the applications running in the cluster. Although some minor changes may be needed to eke out the very best performance, it is perfectly possible for an application to run on a pureScale cluster without any changes whatsoever. It’s also possible to run two CFs in a duplexed arrangement, with DB2 automatically keeping the primary and secondary CF in sync. So, with dual CFs and multiple DB2 members all hosted in separate physical boxes and a fault-tolerant disk subsystem, there’s no single point of failure: losing a member, a CF or a physical disk still allows processing to continue (albeit at a potentially slower pace, because each surviving server has to shoulder more of the processing load). This is therefore a true “active/active” clustering solution.
Finally, IBM is introducing an interesting new licensing option with pureScale. Daily Licensing effectively allows a customer to pay for the capacity they are actually using at any given time, rather than having to size (and pay for) a given environment for peak capacity that may be needed for only a few days per year. This is shown in the two sample diagrams below, with the red areas on the second chart showing the potential capacity/license savings with this model.
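To put rough numbers on the idea, here is a minimal sketch of the arithmetic. Everything in it is an assumption made purely for illustration: the demand profile (two members for most of the year, six for ten peak days), the names used, and above all the simplification that a "member-day" costs the same under both models. Real pricing terms may well differ, so treat the result as a way of reasoning about the shape of the saving rather than a forecast.

```python
# Illustrative comparison of peak-sized licensing versus daily licensing.
# All figures are hypothetical assumptions, not IBM pricing or measurements.

DAYS_IN_YEAR = 365
PEAK_DAYS = 10       # assumed number of days per year at peak load
BASE_MEMBERS = 2     # assumed members needed on an ordinary day
PEAK_MEMBERS = 6     # assumed members needed on a peak day


def member_days_peak_sized(demand):
    """License every day at the highest demand seen in the period."""
    return max(demand) * len(demand)


def member_days_daily(demand):
    """License each day only for the members actually in use that day."""
    return sum(demand)


# Build the assumed daily demand profile for one year.
demand = [BASE_MEMBERS] * (DAYS_IN_YEAR - PEAK_DAYS) + [PEAK_MEMBERS] * PEAK_DAYS

peak_sized = member_days_peak_sized(demand)
daily = member_days_daily(demand)
saving = 1 - daily / peak_sized

print(f"Peak-sized licensing: {peak_sized} member-days")
print(f"Daily licensing:      {daily} member-days")
print(f"Reduction:            {saving:.1%}")
```

With this made-up profile, the peak-sized model pays for 2,190 member-days against 770 under the daily model. The gap narrows quickly as the number of peak days grows, and any price premium on daily capacity would narrow it further.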
I’ve gone on the record stating that pureScale is the single most important development in DB2 for LUW of the past decade, which means that my organization has been busy building practical experience with the new technology. Here are a few of the more interesting discoveries and technology validations we’ve made recently (with thanks to my colleague James Gill, who has done most of the hard work).