| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |

11-01-10, 10:54
|
|
Registered User
|
|
Join Date: Sep 2002
Location: Wisconsin
Posts: 37
|
|
|
Sequential view of a large DB2 table?
|
|
I need a way to get a sequential view of a large DB2 table. I have a batch job that runs forever because it uses a flat file to do cursor selects from a large DB2 table. Each record on the flat file can select hundreds of rows from the DB2 table. There are thousands of records in the flat file and millions of records in the DB2 table, so the batch job can run longer than 24 hours.... and it's a daily job.
If I can get a sequential view of this huge DB2 table, I can apply simple sort type matching logic against the flat file and avoid the creation of thousands of cursors. I thought of creating a cursor that selects everything with an ORDER BY clause, but that didn't seem particularly economical.
|
|

11-01-10, 11:31
|
|
Registered User
|
|
Join Date: Feb 2008
Location: Japan
Posts: 2,193
|
|
Create a table corresponding to the flat file.
Load the flat file into the new table.
Then, join the new table and the large DB2 table.
|
|

11-01-10, 13:45
|
|
Registered User
|
|
Join Date: Sep 2002
Location: Wisconsin
Posts: 37
|
|
|
|
That's an interesting idea. But, I still need the flat file and a cursor to select rows from the big DB3 table for daily processing. Creating the join would take forever. I just found out the DB2 table contains 2 billion rows. That adds some unexpected wrinkles to how to handle it.
|
|

11-01-10, 14:13
|
|
:-)
|
|
Join Date: Jun 2003
Location: Toronto, Canada
Posts: 4,449
|
|
Quote:
Originally Posted by Balr14
I still need the flat file and a cursor to select rows from the big DB3 table for daily processing.
|
No you don't.
Quote:
Originally Posted by Balr14
Creating the join would take forever. I just found out the DB2 table contains 2 billion rows. That adds some unexpected wrinkles to how to handle it.
|
What makes you think that you can perform a join over 2 billion rows faster than the database engine?
|
|

11-01-10, 15:38
|
|
Registered User
|
|
Join Date: Oct 2010
Posts: 6
|
|
|
More information would help
Balr14,
Here's something worth looking into
Does the table (2 billion row table) have indexes that match the cursor query? If not, maybe a combination of db2expln and db2advis would help in determining the performance pain points as well as suggest an index that may help .
The 24hr+ processing time seems a bit much, depending on what the process is doing.
Is all the time spent inside DB2 or does the process (non db2 part) account for a significant portion of the time.
Would appreciate more information on what exactly your process does
Alex
|
|

11-01-10, 16:35
|
|
Registered User
|
|
Join Date: Sep 2002
Location: Wisconsin
Posts: 37
|
|
Quote:
Originally Posted by n_i
No you don't.
What makes you think that you can perform a join over 2 billion rows faster than the database engine?
|
If I don't have the sequential input file how do I select rows for processing?
I was merely pointing out that a join of this size is not trivial effort for the database engine.
|
Last edited by Balr14; 11-01-10 at 16:49.
|

11-01-10, 16:45
|
|
Registered User
|
|
Join Date: Sep 2002
Location: Wisconsin
Posts: 37
|
|
Quote:
Originally Posted by Alex JP
Balr14,
Here's something worth looking into
Does the table (2 billion row table) have indexes that match the cursor query? If not, maybe a combination of db2expln and db2advis would help in determining the performance pain points as well as suggest an index that may help .
The 24hr+ processing time seems a bit much, depending on what the process is doing.
Is all the time spent inside DB2 or does the process (non db2 part) account for a significant portion of the time.
Would appreciate more information on what exactly your process does
Alex
|
Selection criteria for satisfying overnight information requests is added to an input flat file. That flat file is passed against a 2 billion row primary table, that may have 1 - 500 rows that match the selection criteria. Each record in the flat file is used to build a cursor. The results of the cursor is used to drive table selects from other tables. The selection criteria is based on contents of 6 columns. The stats show DB2 usage is very high. The granularity of the selection criteria reduces the efficiency of the indexes.
|
|

11-01-10, 17:15
|
|
Registered User
|
|
Join Date: Dec 2007
Location: Richmond, VA
Posts: 782
|
|
As Tonkuma suggested is the reason that global temp tables were created. It was to bring in large flat files to join to other DB2 tables and speed up your processing.
Dave Nance
|
|

11-01-10, 17:23
|
|
Registered User
|
|
Join Date: Sep 2002
Location: Wisconsin
Posts: 37
|
|
Quote:
Originally Posted by dav1mo
As Tonkuma suggested is the reason that global temp tables were created. It was to bring in large flat files to join to other DB2 tables and speed up your processing.
Dave Nance
|
I must be missing something. How do I access this global temp table sequentially? The join process takes forever. How can I speed up processing by doing this every day? I'm certainly willing to explore this route if I understand it better.
|
Last edited by Balr14; 11-01-10 at 17:27.
|

11-01-10, 17:27
|
|
Registered User
|
|
Join Date: Oct 2010
Posts: 6
|
|
Assuming that the joins now are using TABLESCAN to generate the result sets
Quote:
Originally Posted by Balr14
... That flat file is passed against a 2 billion row primary table, that may have 1 - 500 rows that match the selection criteria. Each record in the flat file is used to build a cursor.
|
An index on the columns that are part of the selection criteria on the 2 billion row table will definitely help. (since it is really looking for at most 500 rows out of 2 billion)
Quote:
Originally Posted by Balr14
... The stats show DB2 usage is very high.
|
Again this makes sense if DB2 is doing a table scan for every one of the cursors.
One more thing you could try after the indexes is to break your flat file into 2 or more parts and run multiple identical processes in parallel.
And from reading your other posts, your company really ought to think about hiring a DBA
just my 2c
Alex
|
|

11-01-10, 17:48
|
|
Registered User
|
|
Join Date: Sep 2002
Location: Wisconsin
Posts: 37
|
|
Quote:
Originally Posted by Alex JP
Assuming that the joins now are using TABLESCAN to generate the result sets
An index on the columns that are part of the selection criteria on the 2 billion row table will definitely help. (since it is really looking for at most 500 rows out of 2 billion)
Again this makes sense if DB2 is doing a table scan for every one of the cursors.
One more thing you could try after the indexes is to break your flat file into 2 or more parts and run multiple identical processes in parallel.
And from reading your other posts, your company really ought to think about hiring a DBA
just my 2c
Alex
|
I'm working on getting permission to get a DBA to help out. The problem is this is a new direction for us and I don't know how long it will last. The current projects I have make extensive use of DB2. But, once these are done in another month or two, I might not see anything requiring this level of DB2 interaction for another year.
|
|

11-01-10, 18:18
|
|
:-)
|
|
Join Date: Jun 2003
Location: Toronto, Canada
Posts: 4,449
|
|
Quote:
Originally Posted by Balr14
How do I access this global temp table sequentially?
|
You don't need to access the temporary table (or any other table, for that matter) sequentially. Database engines are optimized for set operations, such as
Code:
merge into
targettbl tgt
using
temptable tmp
on
tgt.id = tmp.id
when matched and othercriterium = 'YES'
update
set tgt.something = tmp.somethingelse + 1
as opposed to sequential (loop) processing.
|
|

11-01-10, 18:57
|
|
Registered User
|
|
Join Date: Sep 2002
Location: Wisconsin
Posts: 37
|
|
Quote:
Originally Posted by n_i
You don't need to access the temporary table (or any other table, for that matter) sequentially. Database engines are optimized for set operations, such as
Code:
merge into
targettbl tgt
using
temptable tmp
on
tgt.id = tmp.id
when matched and othercriterium = 'YES'
update
set tgt.something = tmp.somethingelse + 1
as opposed to sequential (loop) processing.
|
So I can perform my other processing on a matched condition? But, I don't really have anything to merge or update in the 2B row table. The flat file is used for selection of data from the 2B row table, to be used for processing other tables and report generation.
|
|

11-01-10, 20:35
|
|
:-)
|
|
Join Date: Jun 2003
Location: Toronto, Canada
Posts: 4,449
|
|
Quote:
Originally Posted by Balr14
The flat file is used for selection of data from the 2B row table, to be used for processing other tables
|
Code:
merge into
othertable tgt
using
(select * from
temptable tmp
inner join
twoblnrowstable twob on tmp.id = twob.id) t
on
tgt.id = t.anotherid
when matched and t.othercriterium = 'YES'
update
set tgt.something = t.somethingelse * 3.1415926
Sorry, I cannot give you a more specific example as my telepathic gland does not function well in freezing temperatures.
|
|

11-02-10, 09:52
|
|
Registered User
|
|
Join Date: Sep 2002
Location: Wisconsin
Posts: 37
|
|
I'm not trying to intentionally be vague or misleading. What I am working on is a "proof of concept" demo for a client's database. I have to demonstrate I can speed up their daily processing without touching their DB2 environment. They have been living with this problem for over a year and have tweaked it many times and have tried many methods. It's also written entirely in Spanish and there are few people in the client's IT area that speak English. So, communications are difficult.
I thank you all for the suggestions and I will keep them in mind for some other projects I have to deal with. You are certainly helping me see a side of DB2 I don't usually deal with. It's quite an education and a real change from the internals. I have a way to handle this problem that gives about a 30% improvement. I just wanted to make sure there wasn't a technique I had overlooked.
|
Last edited by Balr14; 11-02-10 at 09:55.
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|