An Oracle Instructor's Guide to DBA Tips and Tricks

By: Christopher Foot
To read more DBA articles, visit http://dba.fyicenter.com/article/

This series of articles is a little different than others you may read. It doesn't focus on one, single topic, but it does provide a wealth of interesting information on a lot of different areas. In fact, this is part one of a three-part series. The first installment includes information on Oracle education and OCP exams and Oracle Internals. Part Two will cover basic database administration techniques, and Part Three will cover tuning and performance and backup and recovery.

Want a listing of the SQL statements currently running in your database and want the most resource consuming ones first? Have you ever wondered how Oracle maintains consistency when you execute the ALTER TABLE MOVE command? What are CHECKPOINT NOT COMPLETE messages and why do they have such a negative impact on performance? When do I use multiple DBWR processes and when do I use DWBR IO slaves? This article will cover these topics and much more.

Oracle Education and OCP Exams

"Can I take the more advanced classes first?"

This is probably not the answer you wanted to hear but the sequence of classes should be followed according to Oracle recommendations. Take the intro classes before taking the more advanced classes.

If you have the luxury (meaning you aren't the only DBA in your shop), gain some day to day experience before taking the more advanced classes (SQL or database tuning, backup and recovery, etc.). You shouldn't be asking questions like "What is an init.ora parameter file, anyway?" in a tuning or backup and recovery class. Instructors don't have the time and your fellow students won't have the patience to bring you up to speed before continuing on to more advanced topics.

If it is an emergency situation, for example your shop's DBA gives two weeks' notice (Oracle DBAs are now considered to be migratory workers by many companies), bring yourself up to speed by doing the following:

Read as much information as you can on the class you are taking before you take it. Oracle's education website, Oracle press books, Oracle's Technet website and third-party consulting companies' web sites contain a wealth of information. You may not know the mechanics, but you do need to know the terms and the concepts being presented in class.
When you attend the classes, notify your instructor that you don't have a lot of day-to-day experience. We want you to get the most out of class, we'll help you by staying later, coming in earlier and giving you additional reading recommendations.
Bring yourself up to speed on the next day's material by reading it the night before. A pet peeve of every instructor are students that don't know the material and don't prepare.

Preparing for the Oracle Certified Professional Exams

The best time to take the exam is a week or two after taking the Oracle class that the exam pertains to. The classes are not required to obtain OCP certification, but they sure help. I have passed every exam I have taken by studying only the information contained in the class workbooks.

Oracle Internals

Instance Recovery and the Oracle Synchronization Process

Oracle instructors often discuss the concept of Oracle data being "out of synch" because the DWBR and LGRW processes act independently. Let's take an in-depth look at how data gets "out of synch" in Oracle and how Oracle gets itself back into shape when it needs to.

There are two memory constructs, two background processes and one storage structure that we need to have a firm understanding of before we continue:

Data buffer cache - memory area used to store blocks read from the data files
Log buffer - memory used to contain before and after images of changed data. Please note the words before and after
DBWR background process - writes dirty buffers from the data buffer cache to the data files
LGWR background process - writes dirty buffers from the redo log buffer to the redo log files
Rollback segments - used to hold before images for transaction recovery, instance recovery and read consistency. Every transaction in Oracle that changes data will write a before image of that data to the rollback segment. If the transaction fails, or the instance fails, the before images stored in the rollback segment are used to put the database back into a transaction consistent state.

When a user process (started by SQL*PLUS, Forms, Reports, OEM, application program, etc.) connects to a database, Oracle starts a server process that performs steps to complete the user process's requests. The server process is responsible for checking the data buffer cache to determine if the data the user process is looking for is contained in the data buffer cache.

If the data is not contained in the data buffer cache, it is the server process's responsibility to read the desired data block from a file on disk and place it in the data buffer cache. If the user process wants to change the data, the server process records a before and after image of the data being changed in the data buffer cache and the redo log buffer cache.

Uncommitted Data on the Data Files

If the user's server process is unable to find a free data buffer, DBWR is notified to flush a section of changed data buffers to their corresponding data files. The section of data buffers being flushed may contain both committed and uncommitted changes.

This could result in Oracle having uncommitted and committed data blocks on the data files. Oracle, for performance reasons, will not check each block in the section being flushed to see if it contains uncommitted data. Oracle wants to write that section as quickly as possible.

A key point is that before this "panic flush" of changed data buffers to disk occurs, DBWR will signal LGWR to write all before images of uncommitted blocks in the section being flushed. This process ensures that all before images of uncommitted data changes can be retrieved from the redo log file during a recovery.

Committed Data Not On the Data Files

When a transaction commits, the server process places a commit record in the log buffer and tells LGWR to perform a contiguous write of all the redo log buffer entries up to and including the commit record to the redo log files (not the data files!).

Oracle is now able to guarantee that the changes will not be lost even if there is an instance failure. Please note that the flushing of dirty data buffers is performed independently by DBWR and can occur either before or after the commit. This could result in Oracle having committed data on the redo log files but not on the data files.

The Synchronization Process

If a failure occurs and the instance terminates abnormally, Oracle must restore the database to a transaction consistent state just prior to the failure. The database must remove all uncommitted data from the data files (because of "panic flushes") and replay all changes committed and recorded in the redo log files but not recorded on the data files (remember that a commit forces LGWR to flush, not DBWR). Oracle restores the database to a transaction consistent state using roll forward and roll backward processes.

Roll Forward Phase

During this phase, DBWR writes both committed and uncommitted data from the redo log files to the data files. The purpose of the roll forward is to apply all changes recorded in the log files to the corresponding data blocks.

Rollback segments are populated during the roll forward phase. A rollback segment entry is added if an uncommitted data block is found on the data files and no rollback entry exists in the rollback segment. At the end of this phase, all committed data is in the data files, although uncommitted data may still exist. The database is then opened to allow user access to database data.

Roll Backward Phase

Oracle removes the uncommitted data by using the rollback segments populated during the roll forward phase or prior to the crash. Blocks are rolled back when requested by the Oracle server or a user's server process, depending on who requests the block first.

The database is therefore available even while the roll backward phase is executing. Only those data blocks that are actively participating in the rollback are not available. Total data synchronization has now occurred.

Latches and Enqueues

What exactly is a latch? Oracle support describes latches as "low level serialization mechanisms used to protect shared data structures in the SGA. The implementation of latches is operating system dependent, particularly in regard to whether a process will wait for a latch and for how long." A latch is designed to be very quickly acquired and freed. Its job is to prevent multiple processes from executing the same piece of code at the same time.

Enqueues, another locking mechanism, allow several processes to share a resource. A read table lock that allows other processes to access the table is a good example of an enqueue. An enqueue allows the process to store the mode in the lock to provide varying degrees of sharing. Enqueues rely upon the operating system's locking mechanism to keep track of the locked resources. If a process has to wait, the operating system will place the requesting process on a wait queue. Latches, on the other hand, use timers or spin-counts (on multiple CPU platforms) to allow the process to "wake up" and try to obtain the latch again at specified intervals.

Index Rebuilds

In releases prior to Oracle8i, the CREATE INDEX and ALTER INDEX REBUILD commands lock the table with a Share Table Lock (S). The Share Table Lock prevents other processes from performing DML and DDL operations on the base table.

Oracle version 8.1 introduced ONLINE as a new option for the REBUILD command. If the ONLINE key word is included as part of the CREATE INDEX or ALTER INDEX…REBUILD commands, the table is locked with a Row Share Table Lock (RS). A Row Share Table Lock allows other processes to perform DML operations but DDL operations are not permitted.

When the ONLINE keyword is specified as a part of an CREATE INDEX or ALTER INDEX....REBUILD command, a temporary journal table is created to record changes made to the base table. The journal is an IOT (Index Organized Table) table type. Oracle merges the changes recorded in the journal at the end of the index build process. This merge process is performed while the table is still online. Oracle may make multiple passes over the journal table to process previously locked rows.

DBWR_IO_SLAVES vs DB_WRITER_PROCESSES

Questions about multiple DBWR processes have plagued DBAs since Oracle7. You configured multiple DBWR process in oracle7 by setting the parameter DB_WRITERS. In Oracle7, multiple DBWR processes were slave processes that were unable to perform asynchronous I/O calls. The algorithm used by the Oracle7 DBWR caused it to incur frequent waits when one buffer in a batch set caused the entire batch to be delayed. This wait was usually caused by a slow disk (I/O) operation. Additionally, this wait also caused delays in scanning and accumulating buffers for the next batch.

The Oracle8i DBWR process corrects this problem. Oracle8i's DBWR continuously writes without waiting for the previously issued calls to complete. As a result, slow writes no longer impact DBWR. This new feature allows DBWR to be inherently asynchronous in nature, regardless of the whether the OS supports asynchronous I/O or not. You configure multiple database writers by setting the parameter DB_WRITER_PROCESSES. This feature was available in Oracle8.0.4 and allows true multiple database writers. There is no master-slave relationship as in Version 7.

If you implement database writer I/O slaves by setting the DBWR_IO_SLAVES parameter, you configure a single (master) DBWR process that has slave processes that are subservient to it. This feature is similar to the DB_WRITERS in Oracle7, except the I/O slaves are now capable of asynchronous I/O on systems that support it. In addition, I/O slaves can be used to "simulate" asynchronous I/O on platforms that do not support asynchronous I/O or implement it inefficiently. Database I/O slaves provide non-blocking, asynchronous requests to simulate asynchronous I/O.

You can't activate both multiple DBWRs and I/O slaves. If both parameters are activated, DBWR_IO_SLAVES will take precedence.

To determine whether to use multiple DBWn processes or database slaves, follow these guidelines:

For write intensive applications that also have a large data buffer cache (100,000 and up), configure DB_WRITER_PROCESSES
For applications that are not write intensive and run on operating systems that support asynchronous I/O, configure DBWR_IO_SLAVES
If the operating system does not support asnychronous I/O, use DBWR_IO_SLAVES
If your server only has one CPU, use DBWR_IO_SLAVES, DBWR processes are CPU intensive