If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Data Access, Manipulation & Batch Languages > ANSI SQL > SQL Performance Tuning

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 07-12-04, 17:25
MattR MattR is offline
Registered User
 
Join Date: Mar 2001
Location: Lexington, KY
Posts: 606
SQL Performance Tuning

I have four tables in our Oracle 9i DBMS:
sorgeor( sorgeor_code, sorgeor_start_range, sorgeor_end_range )
Cardinality 230,000

spraddr( spraddr_pidm, spraddr_atyp_code, spraddr_status_ind, spraddr_zip, etc. )
Cardinality 2,200,000

aprcatg( aprcatg_pidm, aprcatg_donr_code )
Cardinality 260,000

spbpers( spbpers_pidm, spbpers_dead_ind )
Cardinality 716,000

SORGEOR is a ‘georegion’ table. Basically we define geographical areas and store the zip code ranges in the sorgeor table. For example, I might have a georegion row that looks like this:
sorgeor{ OH_COL, 43080-0000, 43085-9999 }
sorgeor{ OH_COL, 43090-0000, 43091-9999 }

This means that we’ve defined the Columbus, OH metro georegion to include anyone whose zip code is between 43080-000, 43085-9999 or 43090-0000 to 43091-9999.

I would be included in this list because my spraddr row looks like this:
spraddr{ 12345, ‘MA’, NULL, 43081-1234 }

My brother would not be, because his address has expired:
spraddr{ 11111, ‘MA’, ‘INACTIVE’, 43082-3333 }

My neighbor would not be included even though he lives in Columbus because he is deceased:
spraddr{ 1234, ‘MA’, NULL, 43080 }
spbpers{ 1234, ‘DECEASED’ }

My mom would not be included even though she lives in Columbus because she does not have any aprcatg_donr_codes which start with A:
spraddr{ 9999, ‘MA’, NULL, 43091-1111 }
aprcatg{ 9999, ‘PG’ }

My goal is, for particular georegions, get a count of everyone who is still alive, has an ‘A%’ donr_code, and still lives in that georegion (status_ind IS NULL and spraddr_to_date < SYSDATE or spraddr_to_date IS NULL) and is of type MA (spraddr_atyp_code of ‘MA’).

This is what I currently have:
Code:
  SELECT /*+INDEX( SPRADDR SPRADDR_KEY_INDEX ) +INDEX( APRCATG PK_APRCATG ) +ALL_ROWS*/
  	   COUNT( DISTINCT spraddr_pidm )
  	 , sorgeor_geor_code
    FROM sorgeor
   INNER
    JOIN spraddr
  	ON spraddr_pidm IN( SELECT /*+INDEX( APRCATG PK_APRCATG )*/ aprcatg_pidm
 						 FROM aprcatg
 					 WHERE aprcatg_pidm = spraddr_pidm
 						 AND aprcatg_donr_code LIKE 'A%' )
     AND spraddr_pidm IN( SELECT spbpers_pidm
 						 FROM spbpers
 					 WHERE spbpers_pidm = spraddr_pidm
 						 AND spbpers_dead_ind IS NULL )
     AND spraddr_atyp_code = 'MA'
     AND spraddr_status_ind IS NULL
     AND NVL( spraddr_to_date, SYSDATE - 1 ) < SYSDATE
     AND spraddr_zip BETWEEN sorgeor_start_range AND sorgeor_end_range   
  GROUP
  	BY sorgeor_geor_code
Here is the query plan for the first query:
Code:
  0, SELECT STATEMENT
    1,  SORT,  GROUP BY
     2,   TABLE ACCESS,   BY INDEX ROWID,   SORGEOR
  	3,	NESTED LOOPS
  	 4,	 NESTED LOOPS
  	  5,	  NESTED LOOPS
 	 6,	 INDEX,	 FULL SCAN,	 PK_APRCATG
 	 7,	 TABLE ACCESS,	 BY INDEX ROWID,	 SPRADDR
 	 8,	 INDEX,		RANGE SCAN,		SPRADDR_KEY_INDEX
 	 9,	 TABLE ACCESS,	 BY INDEX ROWID,	 SPBPERS
 	 10,	 INDEX,	 UNIQUE SCAN,	 PK_SPBPERS
  	 11,	 INLIST ITERATOR
 	 12,	 INDEX,	 RANGE SCAN,	 SORGEOR_KEY_INDEX2
Here is if I remove the index hints:
Code:
  0, SELECT STATEMENT
    1,  SORT,  GROUP BY
     2,   NESTED LOOPS
  	3,	NESTED LOOPS
  	 4,	 HASH JOIN
 	 5,	 TABLE ACCESS,	 FULL,	 SPRADDR
 	 6,	 TABLE ACCESS,	 FULL,	 APRCATG
  	 7,	 INLIST ITERATOR
 	 8,	 TABLE ACCESS,	 BY INDEX ROWID,	 SORGEOR
 	 9,	 INDEX,	 RANGE SCAN,	 SORGEOR_KEY_INDEX2
 	10,	TABLE ACCESS,	BY INDEX ROWID,	SPBPERS
 	 11,	 INDEX,	 UNIQUE SCAN,	 PK_SPBPERS
If I try this query:
Code:
  SELECT  /*+ALL_ROWS INDEX( SPRADDR SPRADDR_KEY_INDEX )*/  
  	   COUNT( DISTINCT spraddr_pidm ) 
  	 , sorgeor_geor_code 
    FROM sorgeor 
   INNER 
    JOIN spraddr 
  	ON EXISTS( SELECT /*+INDEX( APRCATG PK_APRCATG )*/ *
 						 FROM aprcatg 
 					 WHERE aprcatg_pidm = spraddr_pidm 
 						 AND aprcatg_donr_code LIKE 'A%' ) 
     AND EXISTS ( SELECT /*+INDEX( SPBPERS PK_SPBPERS )*/ *
 						 FROM spbpers 
 					 WHERE spbpers_pidm = spraddr_pidm 
 						 AND spbpers_dead_ind IS NULL ) 
     AND spraddr_atyp_code = 'MA' 
     AND spraddr_status_ind IS NULL 
     AND NVL( spraddr_to_date, SYSDATE - 1 ) < SYSDATE 
     AND spraddr_zip BETWEEN sorgeor_start_range AND sorgeor_end_range  
   GROUP 
  	BY sorgeor_geor_code
I get the following plan:
Code:
  0, SELECT STATEMENT
    1,  SORT,  GROUP BY
     2,   NESTED LOOPS
  	3,	MERGE JOIN
  	 4,	 SORT,	 JOIN
 	 5,	 TABLE ACCESS,	 BY INDEX ROWID,	 SPRADDR
  	   6,	   NESTED LOOPS
 	 7,	 INDEX,		FULL SCAN,		PK_APRCATG
 	 8,	 INDEX,		RANGE SCAN,		SPRADDR_KEY_INDEX
  	 9,	 FILTER
  	  10,	  SORT,	  JOIN
 	 11,	 TABLE ACCESS,	 BY INDEX ROWID,	 SORGEOR
 	 12,	 INDEX,		RANGE SCAN,		SORGEOR_KEY_INDEX2
 	13,	TABLE ACCESS,	BY INDEX ROWID,	SPBPERS
 	 14,	 INDEX,	 UNIQUE SCAN,	 PK_SPBPERS
The third query is unconditionally better than the first two. The problem is that it is still pretty slow. For georegion IREL39369 (371 rows in SORGEOR) it returns 538 people in about four minutes. For georegion IREL39106 (5249 rows) it returns 711 people in about fifty minutes. This is nearly linear. If I extrapolate this linear relationship to the total number of georegion rows I get approximately forty-some odd hours to run for all of them.

Any ideas on how to optimize these queries and/or any query ideas from the gallery? This is a 3rd party application DB and I cannot (without headaches) change the schema which includes adding indexes or creating new tables.
__________________
Thanks,

Matt
Reply With Quote
  #2 (permalink)  
Old 07-12-04, 17:35
MattR MattR is offline
Registered User
 
Join Date: Mar 2001
Location: Lexington, KY
Posts: 606
One note, the 'real' spraddr primary key contains a contrived 'spraddr_seq_no' -- a sequence number. I might be able to have two active MA addresses -- for example if I own two homes. This means that a person can possibly qualify for multiple georegions, which is fine.
__________________
Thanks,

Matt
Reply With Quote
  #3 (permalink)  
Old 07-13-04, 15:34
AceOmega AceOmega is offline
Registered User
 
Join Date: Apr 2004
Location: Arizona
Posts: 49
Specify the fields and only the fields you are going to use in your query unless you are doing something like a direct copy or a insert where you need all of the fields you should try to stear clear of Select *.

Other things you could do is to Prepare the table before opening it and setting the read to dirty read.

Also try to avoid Inner, Left and Right Joins. Use a natural join if you can. I know this is hard because some times you need the row even if the forign key is blank.

Onle last thing you could, which looks like you have already done, is to setup indexes on your keys.




It looks like your speed issue may be in the Inner Join. See if you can do with out the inner join and use a natural join.

Also Try Creating an Index on "sorgeor_geor_code" to speed up your group by.

If this does not work, I have found that creating a view solves this problem if your query does not use any paramters.

I hope this helps.

Last edited by AceOmega; 07-13-04 at 15:43.
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On