
Alternative to row-at-a-time processing in Sybase


(Continued from previous question...)

Alternative to row-at-a-time processing in Sybase?

Someone asked how they could speed up their processing. They were batch updating/inserting gobs of information. Their algorithm went something like this:
... In another case I do:
if exists (select record) then
update record
else
insert record


I'm not sure which way is faster or if it makes a difference. I am doing this for as many as 4000 records at a time (calling a stored procedure 4000 times!). I am interested in knowing any way to improve this. The parameter translation alone on the procedure calls takes 40 seconds for 4000 records. I am using exec in DB-Lib.

Would RPC or CT-Lib be better/faster?
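
To make the row-at-a-time pattern concrete, here is a minimal sketch of the kind of per-row procedure being described. The procedure, table, and column names (upsert_record, old_stuff, key_col, val) are placeholders assumed for illustration, not taken from the original question:

create procedure upsert_record
    @key_col int,
    @val     varchar(30)
as
begin
    /* one call per record: test for the row, then update or insert it */
    if exists (select 1 from old_stuff where key_col = @key_col)
        update old_stuff
        set    val = @val
        where  key_col = @key_col
    else
        insert old_stuff (key_col, val)
        values (@key_col, @val)
end

Calling something like this 4000 times pays the procedure-call and parameter-translation overhead 4000 times over, which is exactly the cost the set-based rewrite below avoids.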

A netter responded, stating that it was faster to ditch their algorithm and apply a set-based strategy:
The way to improve your approach is to convert the row-at-a-time processing (a more traditional, procedural style of thinking) into batch-at-a-time processing (a more relational style of thinking). No insult intended; we just need to dial you in to thinking in relational terms.

The idea is to do batches (or bundles) of rows rather than processing a single one at a time.

So let's take your example (since you didn't give exact values [probably out of kindness to save my eyeballs] I'll use your generic example to extend what I'm talking about):

Before:
if exists (select record) then
update record
else
insert record

New way:
1. Load all your rows into a table named new_stuff in a separate work database (call it work_db) using bcp -- no third-generation language (3GL) needed:
   a. truncate new_stuff and drop all of its indexes
   b. sort your data file with UNIX sort, ordering it by the clustered index columns
   c. load it using bcp
   d. create the clustered index with the sorted_data option, plus any ancillary non-clustered indexes
   (a sketch of these load steps appears after the SQL batch below)
2. Assume that your target table is called old_stuff.
3. Do the update in a single batch:

begin tran

/* delete any rows in old_stuff which would normally
** have been updated... we'll insert 'em instead!
** Essentially, treat the update as a delete/insert.
*/

delete old_stuff
from old_stuff,
new_stuff
where old_stuff.key = new_stuff.key

/* insert entire new table: this adds any rows
** that would have been updated before and
** inserts the new rows
*/
insert old_stuff
select * from new_stuff

commit tran
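
For concreteness, here is a rough sketch of the staging-table load described in step 1, assuming a single clustered key column named key_col; the index, file, login, and server names are placeholders, and the OS-level sort and bcp commands appear as comments:

/* step 1a: empty the staging table and drop its indexes */
truncate table new_stuff
drop index new_stuff.new_stuff_cix   /* repeat for each existing index */
go

/* steps 1b and 1c run at the OS prompt, for example:
**   sort -t'|' -k1,1 new_stuff.dat > new_stuff.srt
**   bcp work_db..new_stuff in new_stuff.srt -c -t'|' -Uuser -Sserver
*/

/* step 1d: rebuild the clustered index; sorted_data tells the
** server the rows are already in key order, so it skips the sort
*/
create clustered index new_stuff_cix
    on new_stuff (key_col)
    with sorted_data
go

Because old_stuff and new_stuff share the same key, the delete in the batch above is a straightforward join, and treating the update as a delete followed by an insert keeps the whole job in two set-oriented statements instead of thousands of procedure calls.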

(Continued on next question...)
