If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > DB2 > Retreiving sets of records in DB2

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 02-14-05, 14:06
dsusendran dsusendran is offline
Registered User
 
Join Date: Apr 2004
Location: Inside Intel
Posts: 165
Retreiving sets of records in DB2

Hi all,

DB2 V8.1 fixpak 6 on Solaris 8 and Win 2k

We have a requirement to retreive a select sets of records, wherein I need
records from a source table as input. It needs the records in
batches of n, where n is configurable (say 1000). It must retrieve every
record in the source table (configurable by a restriction) once and only
once. There must be no predefined stored procedures/functions/views that
refer to the source table specifically (but may using Dynamic SQL)
Preferably, no additional columns would need to be added to the source
table. Also, since there could be transaction log limits, using mass select
into/insert statements is bad, the approach should be cursor based.

(i.e. call a function/SP, get 1000 records moved into the staging table.
Call it again, get the next 1000 records moved into the staging table. And
so on, until all the records have been read into the staging table)

In MS SQL Server, this issue is very easy to solve, as it's possible to have
global cursors that are open from one request to the next on a connection.
This capability does not exist in DB2.

I did finally solve this problem, using the following approach:

1. A global temporary table would be created to house the source data.
2. All the records of interest would be moved from the source table to the
temp table.
3. When I need a batch, I would start a cursor on the temp table.
4. I would navigate the first n records of the cursor, using insert into
selects into a staging table as I want, while simultaneously deleting the
record from the temp table (since it's all dynamic SQL, I had to do all this
with keys)
5. When the next batch was needed, repeat 4 as necessary, until the end of
the temp table is reached.

Now here are the problems with this:
1. If the number of records gets large enough, the size of the temp table
may exceed the available user temp tablespace.
2. Performance is horrendous. Importing from a file is faster.

Now here are two other options which I have thought of.
1. Instead of moving the whole source record, only move the key(s) into the
temp table. Then when navigating the cursor, use the key to pull from the
source table and delete the key from the temp table. (Large values may still
exceed user temporary tablespace though, but it would take much higher
volumes)
2. Add columns to the source table (which I don't want to do) which
includes a batch key, a batch index, and server id. When starting the
process, transverse the source table, setting these values. Then use select
into/inserts to pull into the staging table by batch index (which would be
limited to n records, making it feasible).

I'm looking for more ideas, comments, hints, anything. I'm going to have to
do something, since the performance on the existing method is unacceptable.

Thanks in advance,
Newbie
Reply With Quote
  #2 (permalink)  
Old 02-14-05, 17:24
jacampbell jacampbell is offline
Registered User
 
Join Date: Jan 2005
Posts: 191
"In MS SQL Server, this issue is very easy to solve, as it's possible to have
global cursors that are open from one request to the next on a connection.
This capability does not exist in DB2."

I'm not sure what you mean by "open from one request to the next on a connection". A cursor stays open between requests. Do you mean with commits? Then DECLARE c1 CURSOR WITH HOLD will stay open across commits.

What are you actually doing? How is this application written?

James Campbell
Reply With Quote
  #3 (permalink)  
Old 02-15-05, 12:18
dsusendran dsusendran is offline
Registered User
 
Join Date: Apr 2004
Location: Inside Intel
Posts: 165
I want to define and open a cursor in stored procedure P1(), and then reference it at its current position in a completely separate
stored procedure P2().

Like:

Open Reading P1() (opens a cursor)
Read Batch P2() (reads 1000 records From cursor, moves them to a staging table)
Read Batch P2() (reads next 1000 records From cursor, moves them to a staging table)
...
Close Reading P3() (closes cursor)

Thanks in advance,
Newbie
Reply With Quote
  #4 (permalink)  
Old 02-15-05, 14:05
n_i n_i is offline
:-)
 
Join Date: Jun 2003
Location: Toronto, Canada
Posts: 4,449
Quote:
Originally Posted by dsusendran

Like:

Open Reading P1() (opens a cursor)
Read Batch P2() (reads 1000 records From cursor, moves them to a staging table)
Read Batch P2() (reads next 1000 records From cursor, moves them to a staging table)
...
Close Reading P3() (closes cursor)
Now consider this scenario:

Open Reading P1() (opens a cursor)
Read Batch P2() (reads 1000 records From cursor, moves them to a staging table)
Read Batch P2() (reads next 123 records and crashes)
...

Where do you start your next batch now?

I guess P2() should be just saving the last key value it's processed in a working table (permanent, not even temporary), so the next time it is called it will begin processing the record with the next key value. And that's about it - no open cursors, no running out of temp. space.
Reply With Quote
  #5 (permalink)  
Old 02-15-05, 15:22
dsusendran dsusendran is offline
Registered User
 
Join Date: Apr 2004
Location: Inside Intel
Posts: 165
Thanks n_i,

This is not an issue. Each batch will be committed or rolled back as a whole, so all you need to know is which batches were successfully completed (not individual last-processed keys). During the processing of the records read in a batch,flags are set in the original source table indicating that they were successfully processed (the whole thing is transaction based). So on restart, that first batch does not need to be read again.

Newbie
Reply With Quote
  #6 (permalink)  
Old 02-15-05, 15:48
n_i n_i is offline
:-)
 
Join Date: Jun 2003
Location: Toronto, Canada
Posts: 4,449
Quote:
Originally Posted by dsusendran
During the processing of the records read in a batch,flags are set in the original source table indicating that they were successfully processed (the whole thing is transaction based).
So... is there any reason why your procedure cannot just select the next 1000 records that don't have the flags set?
Reply With Quote
  #7 (permalink)  
Old 02-15-05, 16:20
dsusendran dsusendran is offline
Registered User
 
Join Date: Apr 2004
Location: Inside Intel
Posts: 165
Here is a clearer picture of how i would like the thing to work:

Client App............................................... .....................DB
-----------.................................................. .........------------
Opens Reading..........................->..........................P1() (opens a cursor)
Returns ok, read ready................<-
Reads Batch..............................->..........................P2() (reads next 1000 records from cursor, moves them to a staging table)
Returns ok, records moved...........<-..........................Processes records in staging table
Reads Batch..............................->..........................P2() (reads next 1000 records from cursor, moves them to a staging table)
Returns ok, records moved............<-.........................Processes records in staging table
~~~
Closes Reading..........................->...........................P3() (closes cursor)

What I am referring to is that there appears to be no way to persist an open cursor across multiple stored procedure calls from a remote client application.

Thanks,
Newbie
Reply With Quote
  #8 (permalink)  
Old 02-17-05, 00:20
jacampbell jacampbell is offline
Registered User
 
Join Date: Jan 2005
Posts: 191
I believe that called routines (not called sps) can keep cursors open across multiple calls:

call p('open') << this is NOT an exec sql call p; it is a host language call
do {
call p('fetch')
process cgtt
} while data
call p('close')


p:
if parm = 'open' {
declare, prepare and open cursor
declare global temp table gtt
}
elif parm = 'fetch' {
delete from session.gtt
do i times{
fetch
insert into session.gtt
}
}
else {
close cursor
drop session.gtt
}
return

James Campbell
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On