Sybase ASE Performance and Tuning
All Components Affect Response Time & Throughput
We often think that high performance is defined as a fast data server, but the picture
is not that simple. Performance is determined by all these factors:
- The client application itself:
- How efficiently is it written?
- We will return to this later, when we look at application tuning.
- The client-side library:
- What facilities does it make available to the application?
- How easy are they to use?
- The network:
- How efficiently is it used by the client/server connection?
- The DBMS:
- How effectively can it use the hardware?
- What facilities does it supply to help build efficient fast applications?
- The size of the database:
- How long does it take to dump the database?
- How long to recreate it after a media failure?
Unlike some products which aim at performance on paper, Sybase aims at solving the
multi-dimensional problem of delivering high performance for real applications.
OBJECTIVES
To gain an overview of important considerations and alternatives for the design,
development, and implementation of high performance systems in the Sybase client/server
environment. The issues we will address are:
- Client Application and API Issues
- Physical Database Design Issues
- Networking Issues
- Operating System Configuration Issues
- Hardware Configuration Issues
- ASE Configuration Issues
Client application and physical database design decisions will account for over 80% of
your system's "tuneable" performance, so plan your project resources accordingly!
It is highly recommended that every project include individuals who have taken Sybase
Education's Performance and Tuning course. This 5-day course provides the hands-on
experience essential for success.
Client Application Issues
- Tuning Transact-SQL Queries
- Locking and Concurrency
- ANSI Changes Affecting Concurrency
- Application Deadlocking
- Optimizing Cursors in v10
- Special Issues for Batch Applications
- Asynchronous Queries
- Generating Sequential Numbers
- Other Application Issues
Tuning Transact-SQL Queries
- Learn the Strengths and Weaknesses of the Optimizer
- One of the largest factors determining performance is TSQL! Test not only for efficient
plans but also semantic correctness.
- The optimizer will cost every permutation of access for queries involving 4 tables or
fewer. Joins of more than 4 tables are "planned" 4 tables at a time (as listed in
the FROM clause), so not all permutations are evaluated. You can influence the plans for
these large joins through the order of tables in the FROM clause.
- Avoid, if possible, constructs that keep the optimizer from using an index on a search
argument: expressions or mathematical manipulation of the column in the WHERE clause,
datatype mismatches between a column and its search argument, multiple ORs on different
columns of the same table, and negative logic such as NOT or !=.
- Use Tools to Evaluate and Tune Important/Problem Queries
- Use the "set showplan on" command to see the plan chosen as "most
efficient" by optimizer. Run all queries through during development and testing to
ensure accurate access model and known performance. Information comes through the Error
Handler of a DB-Library application.
- Use the "dbcc traceon(3604, 302, 310)" command to see each alternative plan
evaluated by the optimizer. Generally, this is only necessary to understand why the
optimizer won't give you the plan you want or need (or think you need)!
- Use the "set statistics io on" command to see the number of logical and
physical i/o's for a query. Scrutinize those queries with high logical i/o's.
- Use the "set statistics time on" command to see the amount of time (elapsed,
execution, parse and compile) a query takes to run.
- If the optimizer turns out to be a "pessimizer", use the "set forceplan
on" command to change join order to be the order of the tables in the FROM clause.
- If the optimizer refuses to select the proper index for a table, you can force it by
adding the index id in parentheses after the table name in the FROM clause.
SELECT * FROM orders(2), order_detail(1) WHERE ...
This may cause portability issues should index id's vary/change by site !
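For example, a problem query can be bracketed with these diagnostic settings in isql
during development and testing; the tables and columns below are purely illustrative:

   /* hypothetical tables and columns, shown only to illustrate the diagnostic commands */
   set showplan on
   set statistics io on
   set statistics time on
   go

   select o.order_id, d.part_no, d.qty
   from   orders o, order_detail d
   where  o.order_id   = d.order_id
     and  o.order_date = '03/15/1994'
   go

   set showplan off
   set statistics io off
   set statistics time off
   go

Pay particular attention to the logical i/o counts reported by "statistics io"; they stay
comparable from run to run even when caching hides most of the physical i/o.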
Locking and Concurrency
- The Optimizer Decides on Lock Type and Granularity
- Decisions on lock type (share, exclusive, or update) and granularity (page or table) are
made during optimization so make sure your updates and deletes don't scan the table !
- Exclusive Locks are Only Released Upon Commit or Rollback
- Lock Contention can have a large impact on both throughput and response time if not
considered both in the application and database design !
- Keep transactions as small and short as possible to minimize blocking. Consider
alternatives to "mass" updates and deletes such as a v10.0 cursor in a stored
procedure which frequently commits.
- Never include any "user interaction" in the middle of transactions.
- Shared Locks Generally Released After Page is Read
- Share locks "roll" through result set for concurrency. Only
"HOLDLOCK" or "Isolation Level 3" retain share locks until commit or
rollback. Remember also that HOLDLOCK is for read-consistency. It doesn't block other
readers !
- Use optimistic locking techniques such as timestamps and the tsequal() function to check
for updates to a row since it was read (rather than holdlock)
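A minimal sketch of this optimistic technique, assuming a hypothetical orders table with a
browse-mode timestamp column (the hex constant stands in for the timestamp value returned
by the earlier SELECT):

   /* hypothetical table; the trailing "timestamp" column is maintained by ASE */
   create table orders
   (order_id  int      not null,
    status    char(1)  not null,
    timestamp)
   go

   /* read the row without HOLDLOCK, remembering its timestamp value */
   select status, timestamp from orders where order_id = 1001
   go

   /* later, update only if nobody has changed the row in the meantime;
      tsequal() raises an error if the saved value is stale */
   update orders
   set    status = 'C'
   where  order_id = 1001
     and  tsequal(timestamp, 0x00000001000002a5)
   go

If the timestamps no longer match, the UPDATE fails with a timestamp-mismatch error and
the application can simply re-read the row and retry.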
ANSI Changes Affecting Concurrency
- Chained Transactions Risk Concurrency if Behavior not Understood
- Sybase defaults each DML statement to its own transaction if not specified ;
- ANSI automatically begins a transaction with any SELECT, FETCH, OPEN, INSERT, UPDATE, or
DELETE statement ;
- If Chained Transaction must be used, extreme care must be taken to ensure locks aren't
left held by applications unaware they are within a transaction! This is especially
crucial if running at Level 3 Isolation
- Lock at the Level of Isolation Required by the Query
- Read Consistency is NOT a requirement of every query.
- Choose level 3 only when the business model requires it
- Running at Level 1 but selectively applying HOLDLOCKs as needed is safest (see the sketch after this list)
- If you must run at Level 3, use the NOHOLDLOCK clause when you can !
- Beware of (and test) ANSI-compliant third-party applications for concurrency
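A minimal sketch of running at Level 1 with a selectively applied HOLDLOCK (the account
table and columns are hypothetical):

   set chained off                       /* Transact-SQL (unchained) transaction mode */
   set transaction isolation level 1     /* the ASE default */
   go

   begin transaction
       /* hold the share lock on this one table until commit */
       select balance
       from   account holdlock
       where  account_id = 42

       update account
       set    balance = balance - 100
       where  account_id = 42
   commit transaction
   go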
Application Deadlocking
Prior to ASE 10 cursors, many developers simulated cursors by using two or more
connections (dbproc's) and divided the processing between them. Often, this meant one
connection had a SELECT open while "positioned" UPDATEs and DELETEs were issued
on the other connection. The approach inevitably leads to the following problem:
- Connection A holds a share lock on page X (remember that "Rows Pending" on SQL
Server leaves a share lock on the "current" page).
- Connection B requests an exclusive lock on the same page X and waits...
- The APPLICATION waits for connection B to succeed before invoking whatever logic will
remove the share lock (perhaps dbnextrow). Of course, that never happens ...
Since Connection A never requests a lock which Connection B holds, this is NOT a true
server-side deadlock. It's really an "application" deadlock !
Design Alternatives
- Buffer additional rows in the client that are "nonupdateable". This forces the
shared lock onto a page on which the application will not request an exclusive lock.
- Re-code these modules with CT-Library cursors (aka. server-side cursors). These cursors
avoid this problem by disassociating command structures from connection structures.
- Re-code these modules with DB-Library cursors (aka. client-side cursors). These cursors
avoid this problem through buffering techniques and re-issuing of SELECTs. Because of the
re-issuing of SELECTs, these cursors are not recommended for high transaction sites !
Optimizing Cursors with v10.0
Special Issues for Batch Applications
ASE was not designed as a batch subsystem! It was designed as an RDBMS for large
multi-user applications. Designers of batch-oriented applications should consider the
following design alternatives to maximize performance:
Design Alternatives :
- Minimize Client/Server Interaction Whenever Possible
- Don't turn ASE into a "file system" by issuing single-table / single-row
requests when, in actuality, set logic applies (see the example following this list).
- Maximize TDS packet size for efficient Interprocess Communication (v10 only)
- New ASE 10.0 cursors declared and processed entirely within stored procedures and
triggers offer significant performance gains in batch processing.
- Investigate Opportunities to Parallelize Processing
- Breaking up single processes into multiple, concurrently executing connections (where
possible) will outperform single-streamed processes every time.
- Make Use of TEMPDB for Intermediate Storage of Useful Data
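As an illustration of the set-logic point above, a job that fetches each row to the client
and issues one UPDATE per row can often be replaced by a single statement (table and
column names are hypothetical):

   /* mark every detail row shipped on a given day in one statement,
      rather than selecting the rows to the client and updating them one by one */
   update order_detail
   set    status = 'S'
   from   order_detail, shipments
   where  order_detail.order_id = shipments.order_id
     and  shipments.ship_date   = '03/15/1994'
   go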
Asynchronous Queries
Many, if not most, applications and 3rd Party tools are coded to send queries with the
DB-Library call dbsqlexec( ) which is a synchronous call ! It sends a query and then waits
for a response from ASE that the query has completed !
Designing your applications for asynchronous queries provides many benefits:
- A "Cooperative" multi-tasking application design under Windows will allow
users to run other Windows applications while your long queries are processed !
- Provides design opportunities to parallelize work across multiple ASE connections.
Implementation Choices:
- System 10 Client Library Applications:
- True asynchronous behaviour is built into the entire library. Through the appropriate
use of call-backs, asynchronous behavior is the normal processing paradigm.
- Windows DB-Library Applications (not true async but polling for data):
- Use dbsqlsend(), dbsqlok(), and dbdataready() in conjunction with some additional code
in WinMain() to pass control to a background process. Code samples which outline two
different Windows programming approaches (a PeekMessage loop and a Windows Timer approach)
are available in the Microsoft Software Library on Compuserve (GO MSL). Look for SQLBKGD.ZIP
- Non-PC DB-Library Applications (not true async but polling for data):
- Use dbsqlsend(), dbsqlok(), and dbpoll() to utilize non-blocking functions.
Generating Sequential Numbers
Many applications use unique, sequentially increasing numbers, often as primary keys.
While there are good benefits to this approach, generating these keys can be a serious
contention point if care is not taken. For a complete discussion of the alternatives,
download Malcolm Colton's White Paper on Sequential Keys from the SQL Server Library of
our OpenLine forum on Compuserve.
The two best alternatives are outlined below.
- "Primary Key" Table Storing Last Key Assigned
- IDENTITY Columns (v10.0 only)
- Last key assigned for each table is stored in memory and automatically included in all
INSERTs (BCP too). This should be the method of choice for performance.
- Choose a large enough numeric or else all inserts will stop once the max is hit.
- Potential rollbacks in long transactions may cause gaps in the sequence !
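Both alternatives are sketched below with hypothetical names:

   /* 1. "primary key" table holding the last key assigned; keep the
         transaction that grabs the next key as short as possible */
   begin transaction
       update key_table
       set    last_key = last_key + 1
       where  table_name = 'orders'

       select last_key
       from   key_table
       where  table_name = 'orders'
   commit transaction
   go

   /* 2. IDENTITY column (v10.0); the server assigns the value itself */
   create table orders
   (order_id  numeric(10,0)  identity,
    customer  int            not null)
   go

Issuing the UPDATE before the SELECT takes the exclusive lock on the key row immediately,
which avoids the deadlock-prone upgrade from a shared lock to an exclusive lock.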
Other Application Issues
- Transaction Logging Can Bottleneck Some High Transaction Environments
- Committing a Transaction Must Initiate a Physical Write for Recoverability
- Implementing multiple statements as a transaction can assist in these environments by
minimizing the number of log writes (the log is flushed to disk on commit).
- Utilizing the Client Machine's Processing Power Balances Load
- Client/Server doesn't dictate that everything be done on Server!
- Consider moving "presentation" related tasks such as string or mathematical
manipulations, sorting, or, in some cases, even aggregating to the client.
- Populating of "Temporary" Tables Should Use "SELECT INTO"
- balance this with dynamic creation of temporary tables in an OLTP environment. Dynamic
creation may cause blocks in your tempdb.
- "SELECT INTO" operations are not logged and thus are significantly faster than
there INSERT with a nested SELECT counterparts.
- Consider Porting Applications to Client Library Over Time
- True Asynchronous Behavior Throughout Library
- Array Binding for SELECTs
- Dynamic SQL
- Support for ClientLib-initiated callback functions
- Support for Server-side Cursors
- Shared Structures with Server Library (Open Server 10)
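The "SELECT INTO" population of a temporary table mentioned earlier in this list might
look like the following (names are hypothetical; the select into/bulkcopy option is
normally enabled in tempdb):

   /* not logged, so significantly faster than create table + insert/select */
   select   customer_id, total_amount = sum(amount)
   into     #monthly_totals
   from     invoices
   where    invoice_date >= '03/01/1994'
   group by customer_id
   go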
Physical Database Design Issues
- Normalized -vs- Denormalized Design
- Index Selection
- Promote "Updates-in-Place" Design
- Promote Parallel I/O Opportunities
Normalized -vs- Denormalized
- Always Start with a Completely Normalized Database
- Denormalization should be an optimization taken as a result of a performance problem
- Benefits of a normalized database include :
- Accelerates searching, sorting, and index creation since tables are narrower
- Allows more clustered indexes and hence more flexibility in tuning queries, since there
are more tables ;
- Accelerates index searching since indexes tend to be narrower and perhaps shorter ;
- Allows better use of segments to control physical placement of tables ;
- Fewer indexes per table, helping UPDATE, INSERT, and DELETE performance ;
- Fewer NULLs and less redundant data, increasing compactness of the database ;
- Accelerates trigger execution by minimizing the extra integrity work of maintaining
redundant data.
- Joins are Generally Very Fast Provided Proper Indexes are Available
- Normal caching and cindextrips parameter (discussed in Server section) means each join
will do on average only 1-2 physical I/Os.
- Cost of a logical I/O (get page from cache) only 1-2 milliseconds.
- There Are Some Good Reasons to Denormalize
- All queries require access to the "full" set of joined data.
- Majority of applications scan entire tables doing joins.
- Computational complexity of derived columns requires storage for SELECTs
- Others ...
Index Selection
- Without a clustered index, all INSERTs and "out-of-place" UPDATEs go to the
last page. The lock contention in high transaction environments would be prohibitive. This
is also true for INSERTs to a clustered index on a monotonically increasing key.
- High INSERT environments should always cluster on a key which provides the most
"randomness" (to minimize lock / device contention) that is usable in many
queries. Note this is generally not your primary key !
- Prime candidates for clustered index (in addition to the above) include :
- Columns Accessed by a Range
- Columns Used with Order By, Group By, or Joins
- Indexes Help SELECTs and Hurt INSERTs
- Too many indexes can significantly hurt performance of INSERTs and
"out-of-place" UPDATEs.
- Prime candidates for nonclustered indexes include :
- Columns Used in Queries Requiring Index Coverage
- Columns Used to Access Less than 20% (rule of thumb) of the Data.
- Unique indexes should be defined as UNIQUE to help the optimizer
- Minimize index page splits with FILLFACTOR (helps concurrency and minimizes deadlocks); see the example following this list
- Keep the Size of the Key as Small as Possible
- Accelerates index scans and tree traversals
- Use small datatypes whenever possible. Numerics should also be used whenever possible
as they compare faster than strings.
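For example (hypothetical table and columns), a clustered index on a reasonably
"random", range-searched key plus a small UNIQUE nonclustered index built with a
FILLFACTOR:

   /* clustered index on a "random", range-searched key to spread inserts */
   create clustered index orders_clust
       on orders (customer_id, order_date)
       with fillfactor = 80
   go

   /* declared UNIQUE so the optimizer knows at most one row can match */
   create unique nonclustered index orders_id_nc
       on orders (order_id)
   go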
Promote "Update-in-Place" Design
- "Update-in-Place" Faster by Orders of Magnitude
- Performance gain dependent on number of indexes. Recent benchmark (160 byte rows, 1
clustered index and 2 nonclustered) showed 800% difference!
- Alternative ("Out-of-Place" Update) implemented as a physical DELETE followed
by a physical INSERT. These tactics result in:
- Increased Lock Contention
- Increased Chance of Deadlock
- Decreased Response Time and Throughput
- Currently (System 10 and below), Rules for "Update-in-Place" Behavior Include
:
- Columns updated can not be variable length or allow nulls
- Columns updated can not be part of an index used to locate the row to update
- No update trigger on table being updated (because the inserted and deleted tables used
in triggers get their data from the log)
In v4.9.x and below, only one row may be affected and the optimizer must know this in
advance by choosing a UNIQUE index. System 10 eliminated this limitation.
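A sketch of a table and an update designed to qualify for in-place behavior under the
rules above (names are hypothetical):

   create table orders
   (order_id  int      not null,   /* used by the unique index to locate the row */
    status    char(1)  not null)   /* fixed length, no nulls: eligible for in-place update */
   go

   create unique nonclustered index orders_id_nc on orders (order_id)
   go

   /* locates the row via orders_id_nc and changes only the non-indexed,
      fixed-length status column, so the update can happen in place */
   update orders set status = 'C' where order_id = 1001
   go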
Promote Parallel I/O Opportunities
Networking Issues
- Choice of Transport Stacks
- Variable Sized TDS Packets
- TCP/IP Packet Batching
Choice of Transport Stacks for PCs
- Choose a Stack that Supports "Attention Signals" (aka. "Out of Band
Data")
- Provides for the most efficient mechanism to cancel queries.
- Essential for sites providing ad-hoc query access to large databases.
- Without "Attention Signal" capabilities (or the urgent flag in the connection
string), the DB-Library functions DBCANQUERY ( ) and DBCANCEL ( ) will cause ASE to
send all rows back to the Client DB-Library as quickly as possible so as to complete the
query. This can be very expensive if the result set is large and, from the user's
perspective, causes the application to appear as though it has hung.
- With "Attention Signal" capabilities, Net-Library is able to send an
out-of-sequence packet requesting the ASE to physically throw away any remaining
results providing for instantaneous response.
- Currently, the following network vendors and associated protocols support an
"Attention Signal" capable implementation:
- NetManage NEWT
- FTP TCP
- Named Pipes (10860) - Do not use urgent parameter with this Netlib
- Novell LAN Workplace v4.1 - Patch required from Novell
- Novell SPX - Implemented internally through an "In-Band" packet
- Wollongong Pathway
- Microsoft TCP - Patch required from Microsoft
Variable-sized TDS Packets
- Pre-v4.6 TDS Does Not Optimize Network Performance
- Current ASE TDS packet size is limited to 512 bytes while network frame sizes are
significantly larger (1508 bytes on Ethernet and 4120 bytes on Token Ring).
The specific protocol may have other limitations!
For example:
- IPX is limited to 576 bytes in a routed network.
- SPX requires acknowledgement of every packet before it will send another. A recent
benchmark measured a 300% performance hit over TCP in "large" data transfers
(small transfers showed no difference).
- Open Client Apps can "Request" a Larger Packet
- Shown to have significant performance improvement on "large" data transfers such
as BCP, Text/Image Handling, and Large Result Sets.
- clients:
- isql -Usa -Annnnn
- bcp -Usa -Annnnn
- ct_con_props (connection, CS_SET, CS_PACKETSIZE, &packetsize, sizeof(packetsize),
NULL)
- An "SA" must Configure each Servers' Defaults Properly
- sp_configure "default packet size", nnnnn - Sets default packet size per
client connection (defaults to 512)
- sp_configure "maximum packet size", nnnnn - Sets maximum TDS packet size per
client connection (defaults to 512)
- sp_configure "additional netmem", nnnnn - Additional memory for large packets
taken from separate pool. This memory does not come from the sp_configure memory setting.
Optimal value = ((# of connections using large packets) x (large packet size) x 3) + an
additional 1-2% of the above calculation for overhead
Each connection using large packets has 3 network buffers: one to read; one to write;
and one overflow.
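A sketch of the server-side settings, assuming roughly 50 connections will use 4096-byte
packets (the figures are illustrative, and these memory-related parameters generally take
effect only after a restart):

   sp_configure "default packet size", 512
   go
   sp_configure "maximum packet size", 4096
   go
   /* additional netmem: 50 connections x 4,096 bytes x 3 buffers = 614,400,
      plus roughly 2% overhead, rounded up to 626,688 */
   sp_configure "additional netmem", 626688
   go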
TCP/IP Packet Batching
Operating System Issues
- Never Let ASE Page Fault
- It is better to configure ASE with less memory and do more physical database I/O
than to page fault. OS page faults are synchronous and stop the entire dataserver engine
until the page fault completes. Since database I/O's are asynchronous, other user tasks
can continue!
- Use Process Affinitying in SMP Environments, if Supported
- Affinitying dataserver engines to specific CPUs minimizes overhead associated with
moving process information (registers, etc) between CPUs. Most implementations will
preference other tasks onto other CPUs as well allowing even more CPU time for dataserver
engines.
- Watch out for OS's which are not fully symmetric. Affinitying dataserver engines onto
CPUs that are heavily used by the OS can seriously degrade performance. Benchmark with
your application to find optimal binding.
- Increase priority of dataserver engines, if supported
- Give ASE the opportunity to do more work. If ASE has nothing to do, it
will voluntarily yield the CPU.
- Watch out for OS's which externalize their async drivers. They need to run too!
- Use of OS Monitors to Verify Resource Usage
- The OS CPU monitors only "know" that an instruction is being executed. With
ASE's own threading and scheduling, it can routinely be 90% idle when the OS thinks
it's 90% busy. SQL Monitor shows real CPU usage.
- Look into high disk I/O wait time or I/O queue lengths. These indicate physical
saturation points in the I/O subsystem or poor data distribution.
- Disk Utilization above 50% may be subject to queuing effects which often manifest
themselves as uneven response times.
- Look into high system call counts which may be symptomatic of problems.
- Look into high context switch counts which may also be symptomatic of problems.
- Optimize your kernel for ASE (minimal OS file buffering, adequate network
buffers, appropriate KEEPALIVE values, etc).
- Use OS Monitors and SQL Monitor to Determine Bottlenecks
- Most likely "Non-Application" contention points include:
Resource                   Where to Look
--------                   -------------
CPU Performance            SQL Monitor - CPU and Trends
Physical I/O Subsystem     OS monitoring tools - iostat, sar, ...
Transaction Log            SQL Monitor - Device I/O and Device Hit Rate on Log Device
ASE Network Polling        SQL Monitor - Network and Benchmark Baselines
Memory                     SQL Monitor - Data and Cache Utilization
- Use of Vendor-supported Striping such as LVM and RAID
- These technologies provide a very simple and effective mechanism of load balancing I/O
across physical devices and channels.
- Use them provided they support asynchronous I/O and reliable writes.
- These approaches do not eliminate the need for Sybase segments to ensure minimal
contention for internal resources.
- Non-read-only environments should expect performance degradations when using RAID levels
other than level 0. These levels all include fault tolerance where each write requires
additional reads to calculate a "parity" as well as the extra write of the
parity data.
Hardware Configuration Issues
- Number of CPUs
- Use information from SQL Monitor to assess ASE's CPU usage.
- In SMP environments, dedicate at least one CPU for the OS.
- The advantages and scaling of VSA (Virtual Server Architecture) are application-dependent.
VSA was architected with large multi-user systems in mind.
- I/O Subsystem Configuration
- Look into high Disk I/O Wait Times or I/O Queue Lengths. These may indicate physical I/O
saturation points or poor data distribution.
- Disk Utilization above 50% may be subject to queuing effects which often manifest
themselves as uneven response times.
- Logical Volume configurations can impact performance of operations such as create
database, create index, and bcp. To optimize for these operations, create Logical Volumes
such that they start on different channels / disks to ensure I/O is spread across
channels.
- Discuss device and controller throughput with hardware vendors to ensure channel
throughput high enough to drive all devices at maximum rating.
General ASE Tuning
- Changing Values with sp_configure or buildmaster
It is imperative that you only use sp_configure to change those parameters that it
currently maintains because the process of reconfiguring actually recalculates a number of
other buildmaster parameters. Using the Buildmaster utility to change a parameter
"managed" by sp_configure may result in a mis-configured server and cause
adverse performance or even worse ...
- Sizing Procedure Cache
- ASE maintains an MRU-LRU chain of stored procedure query plans. As users execute
sprocs, ASE looks in cache for a query plan to use. However, stored procedure query
plans are currently not re-entrant! If a query plan is available, it is placed on the MRU
and execution begins. If no plan is in memory, or if all copies are in use, a new copy is
read from the sysprocedures table. It is then optimized and put on the MRU for execution.
- Use dbcc memusage to evaluate the size and number of each sproc currently in cache. Use
SQL Monitor's cache statistics to get your average cache hit ratio. Ideally during
production, one would hope to see a high hit ratio to minimize the procedure reads from
disk. Use this information in conjunction with your desired hit ratio to calculate the
amount of memory needed.
- Memory
- Tuning memory is more a price/performance issue than anything else! The more memory you
have available, the greater the probability of minimizing physical I/O. Minimizing
physical I/O is an important goal: not only does physical I/O take significantly longer,
but threads doing physical I/O must go through the scheduler once the I/O completes,
which means work on behalf of the thread may not actually continue to execute for quite a while!
- There are no longer (as of v4.8) any inherent limitations in ASE which cause a
point of diminishing returns on memory size.
- Calculate Memory based on the following algorithm :
Total Memory = Dataserver Executable Size (in bytes) +
Static Overhead of 1 Mb +
User Connections x 40,960 bytes +
Open Databases x 644 bytes +
Locks x 32 bytes +
Devices x 45,056 bytes +
Procedure Cache +
Data Cache
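As a hypothetical worked example, consider a 2 Mb dataserver executable, 200 user
connections, 10 open databases, 5,000 locks, and 20 devices:

   Dataserver executable        2,097,152   (2 Mb)
   Static overhead              1,048,576   (1 Mb)
   User connections             8,192,000   (200 x 40,960)
   Open databases                   6,440   (10 x 644)
   Locks                          160,000   (5,000 x 32)
   Devices                        901,120   (20 x 45,056)
                               ----------
   Fixed requirement           12,405,288   (roughly 12 Mb)

Whatever memory is configured beyond this fixed requirement is divided between the
procedure cache and the data cache.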
- Recovery Interval
- As users change data in ASE, only the transaction log is written to disk right
away for recoverability. "Dirty" data and index pages are kept in cache and
written to disk at a later time. This provides two major benefits:
- Many transactions may change a page yet only one physical write is done
- ASE can schedule the physical writes "when appropriate"
- ASE must eventually write these "dirty" pages to disk.
- A checkpoint process wakes up periodically and "walks" the cache chain looking
for dirty pages to write to disk
- The recovery interval controls how often checkpoint writes dirty pages.
- Tuning Recovery Interval
- A low value may cause unnecessary physical I/O lowering throughput of the system.
Automatic recovery is generally much faster during boot-up.
- A high value minimizes unnecessary physical I/O and helps throughput of the system.
Automatic recovery may take substantial time during boot-up.
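The interval is set with sp_configure; for example, to allow automatic recovery to take up
to ten minutes per database:

   /* expressed in minutes of estimated recovery time per database */
   sp_configure "recovery interval", 10
   go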
Audit Performance Tuning for v10.0
- Potentially as Write Intensive as Logging
- Isolate Audit I/O from other components.
- Since auditing nearly always involves sequential writes, RAID Level 0 disk striping or
other byte-level striping technology should provide the best performance (theoretically).
- Size Audit Queue Carefully
- Audit records generated by clients are stored in an in-memory audit queue until they can
be processed.
- Tune the queue's size with sp_configure "audit queue size", nnnn (in rows).
- Sizing this queue too small will seriously impact performance since all user processes
who generate audit activity will sleep if the queue fills up.
- Size Audit Database Carefully
- Each audit row could require up to 416 bytes depending on what is audited.
- Sizing this database too small will seriously impact performance since all user
processes who generate audit activity will sleep if the database fills up.