If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > DB2 > Sequential view of a large DB2 table?

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 11-01-10, 10:54
Balr14 Balr14 is offline
Registered User
 
Join Date: Sep 2002
Location: Wisconsin
Posts: 37
Sequential view of a large DB2 table?

I need a way to get a sequential view of a large DB2 table. I have a batch job that runs forever because it uses a flat file to do cursor selects from a large DB2 table. Each record on the flat file can select hundreds of rows from the DB2 table. There are thousands of records in the flat file and millions of records in the DB2 table, so the batch job can run longer than 24 hours.... and it's a daily job.

If I can get a sequential view of this huge DB2 table, I can apply simple sort type matching logic against the flat file and avoid the creation of thousands of cursors. I thought of creating a cursor that selects everything with an ORDER BY clause, but that didn't seem particularly economical.
Reply With Quote
  #2 (permalink)  
Old 11-01-10, 11:31
tonkuma tonkuma is offline
Registered User
 
Join Date: Feb 2008
Location: Japan
Posts: 2,193
Create a table corresponding to the flat file.
Load the flat file into the new table.
Then, join the new table and the large DB2 table.
Reply With Quote
  #3 (permalink)  
Old 11-01-10, 13:45
Balr14 Balr14 is offline
Registered User
 
Join Date: Sep 2002
Location: Wisconsin
Posts: 37
That's an interesting idea. But, I still need the flat file and a cursor to select rows from the big DB3 table for daily processing. Creating the join would take forever. I just found out the DB2 table contains 2 billion rows. That adds some unexpected wrinkles to how to handle it.
Reply With Quote
  #4 (permalink)  
Old 11-01-10, 14:13
n_i n_i is offline
:-)
 
Join Date: Jun 2003
Location: Toronto, Canada
Posts: 4,449
Quote:
Originally Posted by Balr14 View Post
I still need the flat file and a cursor to select rows from the big DB3 table for daily processing.
No you don't.

Quote:
Originally Posted by Balr14 View Post
Creating the join would take forever. I just found out the DB2 table contains 2 billion rows. That adds some unexpected wrinkles to how to handle it.
What makes you think that you can perform a join over 2 billion rows faster than the database engine?
Reply With Quote
  #5 (permalink)  
Old 11-01-10, 15:38
Alex JP Alex JP is offline
Registered User
 
Join Date: Oct 2010
Posts: 6
More information would help

Balr14,

Here's something worth looking into
Does the table (2 billion row table) have indexes that match the cursor query? If not, maybe a combination of db2expln and db2advis would help in determining the performance pain points as well as suggest an index that may help .
The 24hr+ processing time seems a bit much, depending on what the process is doing.
Is all the time spent inside DB2 or does the process (non db2 part) account for a significant portion of the time.

Would appreciate more information on what exactly your process does
Alex
Reply With Quote
  #6 (permalink)  
Old 11-01-10, 16:35
Balr14 Balr14 is offline
Registered User
 
Join Date: Sep 2002
Location: Wisconsin
Posts: 37
Quote:
Originally Posted by n_i View Post
No you don't.


What makes you think that you can perform a join over 2 billion rows faster than the database engine?
If I don't have the sequential input file how do I select rows for processing?

I was merely pointing out that a join of this size is not trivial effort for the database engine.

Last edited by Balr14; 11-01-10 at 16:49.
Reply With Quote
  #7 (permalink)  
Old 11-01-10, 16:45
Balr14 Balr14 is offline
Registered User
 
Join Date: Sep 2002
Location: Wisconsin
Posts: 37
Quote:
Originally Posted by Alex JP View Post
Balr14,

Here's something worth looking into
Does the table (2 billion row table) have indexes that match the cursor query? If not, maybe a combination of db2expln and db2advis would help in determining the performance pain points as well as suggest an index that may help .
The 24hr+ processing time seems a bit much, depending on what the process is doing.
Is all the time spent inside DB2 or does the process (non db2 part) account for a significant portion of the time.

Would appreciate more information on what exactly your process does
Alex

Selection criteria for satisfying overnight information requests is added to an input flat file. That flat file is passed against a 2 billion row primary table, that may have 1 - 500 rows that match the selection criteria. Each record in the flat file is used to build a cursor. The results of the cursor is used to drive table selects from other tables. The selection criteria is based on contents of 6 columns. The stats show DB2 usage is very high. The granularity of the selection criteria reduces the efficiency of the indexes.
Reply With Quote
  #8 (permalink)  
Old 11-01-10, 17:15
dav1mo dav1mo is offline
Registered User
 
Join Date: Dec 2007
Location: Richmond, VA
Posts: 782
As Tonkuma suggested is the reason that global temp tables were created. It was to bring in large flat files to join to other DB2 tables and speed up your processing.
Dave Nance
Reply With Quote
  #9 (permalink)  
Old 11-01-10, 17:23
Balr14 Balr14 is offline
Registered User
 
Join Date: Sep 2002
Location: Wisconsin
Posts: 37
Quote:
Originally Posted by dav1mo View Post
As Tonkuma suggested is the reason that global temp tables were created. It was to bring in large flat files to join to other DB2 tables and speed up your processing.
Dave Nance
I must be missing something. How do I access this global temp table sequentially? The join process takes forever. How can I speed up processing by doing this every day? I'm certainly willing to explore this route if I understand it better.

Last edited by Balr14; 11-01-10 at 17:27.
Reply With Quote
  #10 (permalink)  
Old 11-01-10, 17:27
Alex JP Alex JP is offline
Registered User
 
Join Date: Oct 2010
Posts: 6
Assuming that the joins now are using TABLESCAN to generate the result sets

Quote:
Originally Posted by Balr14 View Post
... That flat file is passed against a 2 billion row primary table, that may have 1 - 500 rows that match the selection criteria. Each record in the flat file is used to build a cursor.
An index on the columns that are part of the selection criteria on the 2 billion row table will definitely help. (since it is really looking for at most 500 rows out of 2 billion)

Quote:
Originally Posted by Balr14 View Post
... The stats show DB2 usage is very high.
Again this makes sense if DB2 is doing a table scan for every one of the cursors.

One more thing you could try after the indexes is to break your flat file into 2 or more parts and run multiple identical processes in parallel.

And from reading your other posts, your company really ought to think about hiring a DBA

just my 2c
Alex
Reply With Quote
  #11 (permalink)  
Old 11-01-10, 17:48
Balr14 Balr14 is offline
Registered User
 
Join Date: Sep 2002
Location: Wisconsin
Posts: 37
Quote:
Originally Posted by Alex JP View Post
Assuming that the joins now are using TABLESCAN to generate the result sets



An index on the columns that are part of the selection criteria on the 2 billion row table will definitely help. (since it is really looking for at most 500 rows out of 2 billion)



Again this makes sense if DB2 is doing a table scan for every one of the cursors.

One more thing you could try after the indexes is to break your flat file into 2 or more parts and run multiple identical processes in parallel.

And from reading your other posts, your company really ought to think about hiring a DBA

just my 2c
Alex
I'm working on getting permission to get a DBA to help out. The problem is this is a new direction for us and I don't know how long it will last. The current projects I have make extensive use of DB2. But, once these are done in another month or two, I might not see anything requiring this level of DB2 interaction for another year.
Reply With Quote
  #12 (permalink)  
Old 11-01-10, 18:18
n_i n_i is offline
:-)
 
Join Date: Jun 2003
Location: Toronto, Canada
Posts: 4,449
Quote:
Originally Posted by Balr14 View Post
How do I access this global temp table sequentially?
You don't need to access the temporary table (or any other table, for that matter) sequentially. Database engines are optimized for set operations, such as
Code:
merge into
 targettbl tgt
using 
 temptable tmp
on
 tgt.id = tmp.id
when matched and othercriterium = 'YES'
update 
 set tgt.something = tmp.somethingelse + 1
as opposed to sequential (loop) processing.
Reply With Quote
  #13 (permalink)  
Old 11-01-10, 18:57
Balr14 Balr14 is offline
Registered User
 
Join Date: Sep 2002
Location: Wisconsin
Posts: 37
Quote:
Originally Posted by n_i View Post
You don't need to access the temporary table (or any other table, for that matter) sequentially. Database engines are optimized for set operations, such as
Code:
merge into
 targettbl tgt
using 
 temptable tmp
on
 tgt.id = tmp.id
when matched and othercriterium = 'YES'
update 
 set tgt.something = tmp.somethingelse + 1
as opposed to sequential (loop) processing.
So I can perform my other processing on a matched condition? But, I don't really have anything to merge or update in the 2B row table. The flat file is used for selection of data from the 2B row table, to be used for processing other tables and report generation.
Reply With Quote
  #14 (permalink)  
Old 11-01-10, 20:35
n_i n_i is offline
:-)
 
Join Date: Jun 2003
Location: Toronto, Canada
Posts: 4,449
Quote:
Originally Posted by Balr14 View Post
The flat file is used for selection of data from the 2B row table, to be used for processing other tables
Code:
merge into
 othertable tgt
using 
 (select * from 
  temptable tmp 
  inner join 
  twoblnrowstable twob on tmp.id = twob.id) t
on
 tgt.id = t.anotherid
when matched and t.othercriterium = 'YES'
update 
 set tgt.something = t.somethingelse * 3.1415926
Sorry, I cannot give you a more specific example as my telepathic gland does not function well in freezing temperatures.
Reply With Quote
  #15 (permalink)  
Old 11-02-10, 09:52
Balr14 Balr14 is offline
Registered User
 
Join Date: Sep 2002
Location: Wisconsin
Posts: 37
I'm not trying to intentionally be vague or misleading. What I am working on is a "proof of concept" demo for a client's database. I have to demonstrate I can speed up their daily processing without touching their DB2 environment. They have been living with this problem for over a year and have tweaked it many times and have tried many methods. It's also written entirely in Spanish and there are few people in the client's IT area that speak English. So, communications are difficult.

I thank you all for the suggestions and I will keep them in mind for some other projects I have to deal with. You are certainly helping me see a side of DB2 I don't usually deal with. It's quite an education and a real change from the internals. I have a way to handle this problem that gives about a 30% improvement. I just wanted to make sure there wasn't a technique I had overlooked.

Last edited by Balr14; 11-02-10 at 09:55.
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On