Differences between clustered and non-clustered

DBA > Job Interview Questions > Sybase Interview Questions and Answers

More DBA job interview questions and answers at http://dba.fyicenter.com/Interview-Questions/

(Continued from previous question...)

Differences between clustered and non-clustered in Sybase ?

Clustered Index
A phone book lists everyone by last name. We have an A section, we have a B section and so forth. Within each section my phone book is clever enough to list the starting and ending names for the given page.
The phone book is clustered by last name.
create clustered index on phone_book (last_name)
It's fast to perform the following queries on the phone book:
* Find the address of those whose last name is Cisar.
* Find the address of those whose last name is between Even and Fa

Searches that don't work well:
* Find the address of those whose phone number is 440-1300. * Find the address of those whose prefix is 440 In order to determine the answer to the two above we'd have to search the entire phone book. We can call that a table scan.

Non-Clustered Index
To help us solve the problem above we can build a non-clustered index.
create nonclustered index on phone_book (phone_number)
Our non-clustered index will be built and maintained by our Mythical ASE as follows:
1. Create a data structure that will house a phone_number and information where the phone_number exists in the phone book: page number and the row within the page. The phone numbers will be kept in ascending order.
2. Scan the entire phone book and add an entry to our data structure above for each phone number found.
3. For each phone number found, note along side it the page number that it was located and which row it was in.

any time we insert, update or delete new numbers, our M-ASE will maintain this secondary data structure. It's such a nice Server.
Now when we ask the question:
Find the address of those whose phone number is 440-1300
we don't look at the phone book directly but go to our new data structure and it tells us which page and row within the page the above phone number can be found.
Draw backs? Well, yes. Because we probably still can't answer the question:
Find the address of those whose prefix is 440
This is because of the data structure being used to implement non-clustered indexes. The structure is a list of ordered values (phone numbers) which point to the actual data in the phone book. This indirectness can lead to trouble when a range or a match query is issued.

The structure may look like this:


------------------------------------
|Phone Number   |  Page Number/Row |
====================================
| 440-0000      |  300/23          |
| 440-0001      |  973/45          |
| 440-0002      |   23/2           |
| ...           |                  |
| 440-0030      |  973/45          |
| 440-0031      |  553/23          |
| ...           |                  |
------------------------------------

As one can see, certain phone numbers may map to the same page. This makes sense, but we need to consider one of our constraints: our Server only has room for 200 phone pages.

What may happen is that we re-read the same phone page many times. This isn't a problem if the phone page is in memory. We have limited memory, however, and we may have to flush our memory to make room for other phone pages. So the re-reading may actually be a disk I/O.

The Server needs to decide when it's best to do a table scan versus using the non-clustered index to satisfy mini-range type of queries. The way it decides this is by applying a heuristic based on the information maintained when an update statistics is performed.

In summary, non-clustered indexes work really well when used for highly selective queries and they may work for short, range type of queries.

(Continued on next question...)