If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > DB2 > SQL Optimization - Multiple element

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 04-11-07, 01:13
diwakar123 diwakar123 is offline
Registered User
 
Join Date: Jan 2007
Posts: 33
SQL Optimization - Multiple element

Hi

We have following query which insert


-- Element=ADD1

INSERT INTO STAGE.CLIENT_DATA
(SELECT C.ADDRESSID, NEXTVAL FOR STAGE.ID_SEQ,'ADD1', C1.RA_STNBR
FROM STAGE.CLIENT C1,STAGE.CLIENT_ADDRESSDATA1 C
WHERE
C1.RID=C.RID AND NOT EXISTS (SELECT * FROM DB2INST3.ELEMENT A WHERE A.ADDRESSID=C.ADDRESSID
AND A.ELEMENTTYPE='ADD1'));

INSERT INTO STAGE.CLIENT_DATA
(SELECT D.ADDRESSID, D.ID,'ADD1',C1.RA_STNBR
FROM STAGE.CLIENT_ADDRESSDATA1 C, DB2INST3.ELEMENT D,STAGE.CLIENT C1
WHERE C.ADDRESSID=D.ADDRESSID AND D.ELEMENTTYPE='ADD1' AND C1.RID=C.RID);

PS: First query to update existing records, second to insert new record.

The above queries will be repeated for each address element such as ADD2, ADD3, CITY, STATE,ZIP etc....

The # of element= 2 times the query for single table.

If table is having huge data then we are runnning the same query multiple times, which will take long time and definately is not advisable solution.. Is there any alternative in DB2 UDB to write single query for insert /update which takes care of all address element as listed above. We are using db2 udb 8.2 for linux.

Thanks
Diwakar
Reply With Quote
  #2 (permalink)  
Old 04-11-07, 07:53
stolze stolze is offline
Registered User
 
Join Date: Jan 2007
Location: Jena, Germany
Posts: 2,662
The first statement won't update anything. It is an INSERT statement and, thus, can only insert new rows. If you need to perform an UPDATE and INSERT statement together, then have a look at the MERGE statement.

If I got this right, you have only different behavior for the INSERT operation depending on the existence of rows in the ELEMENT table, right? I'd say that a regular LEFT OUTER JOIN should do what you want:
Code:
INSERT
INTO   stage.client_data
SELECT ca.addressid,
       coalesce(e.id, NEXTVAL FOR stage.id_seq),
       'ADD1',
       c.ra_stnbr
FROM   stage.client AS c JOIN
       stage.client_addressdata1 AS ca ON ( c.rid = ca.rid ) LEFT OUTER JOIN
       ( SELECT id
         FROM   db2inst3.element
         WHERE  elementtype = 'ADD1 ) AS e ON ( ca.addressid = e.addressid )
__________________
Knut Stolze
IBM DB2 Analytics Accelerator
IBM Germany Research & Development
Reply With Quote
  #3 (permalink)  
Old 04-11-07, 09:27
diwakar123 diwakar123 is offline
Registered User
 
Join Date: Jan 2007
Posts: 33
Knut,

Thanks for quick response.... I am using MERGE for actual prod table.. This is stage table and used as intermediate table between denormalized table and actual prod table.

So, if I have multiple address element, then do I need to repeat the same query that many times (# of address elements)?

How do I take care multiple adress element within single query?

e.g. 'ADD2',
'ADD3' and so on.

Thnaks
Diwakar
Reply With Quote
  #4 (permalink)  
Old 04-11-07, 13:20
stolze stolze is offline
Registered User
 
Join Date: Jan 2007
Location: Jena, Germany
Posts: 2,662
What are the semantics of the ADDx values? When will you use ADD1, ADD2, and so on? Are you always querying the same 3 tables with the same conditions (just that the ADDx varies)?
__________________
Knut Stolze
IBM DB2 Analytics Accelerator
IBM Germany Research & Development
Reply With Quote
  #5 (permalink)  
Old 04-11-07, 14:57
diwakar123 diwakar123 is offline
Registered User
 
Join Date: Jan 2007
Posts: 33
Knut,

Actually requirement is to polulate all address element eg.

ADD1 = 4560 E
ADD2 = NW ?
CITY= Avon
STATE=NJ
COUTNY ?? and so on..

All records will have these fields in address and we would need to polulate above table with all address element. I had originally provided only 2 SQLs for ADD1, but I am having 12 SQLs for 6 address element.But after LEFT OUTER JOIN the SQLs will still be 6 for 6 address element..

Is there anyway to combine all 6 together w/o affecting the data.

Thanks
Diwakar
Reply With Quote
  #6 (permalink)  
Old 04-11-07, 14:58
diwakar123 diwakar123 is offline
Registered User
 
Join Date: Jan 2007
Posts: 33
Yes, ADDx will change and actual column name [for values of source table].

otherwise query will remain same for all 6 address element.

Thanks
Diwakar
Reply With Quote
  #7 (permalink)  
Old 04-12-07, 06:11
stolze stolze is offline
Registered User
 
Join Date: Jan 2007
Location: Jena, Germany
Posts: 2,662
Could you provide an example of 2 SQL statements, i.e. two ADDx's? I'm still having trouble grasping what you actually want to achieve.

p.s: My general rule is that you can combine many, many things in a single SQL statement - if you know how. So my answer would be that we should find something to handle all 6 ADDx parts in a single statement (and hopefully with a single table scan only).
__________________
Knut Stolze
IBM DB2 Analytics Accelerator
IBM Germany Research & Development
Reply With Quote
  #8 (permalink)  
Old 04-12-07, 11:14
diwakar123 diwakar123 is offline
Registered User
 
Join Date: Jan 2007
Posts: 33
Knut,

Here are 2 out of 6 address lement SQLs.
--ADD1
INSERT INTO STAGE.CLIENT_DATA
(SELECT C.ADDRESSID, NEXTVAL FOR STAGE.ID_SEQ,'ADD1', C1.RA_STNBR
FROM STAGE.CLIENT C1,STAGE.CLIENT_ADDRESSDATA1 C
WHERE
C1.RID=C.RID AND NOT EXISTS (SELECT * FROM DB2INST3.ELEMENT A WHERE A.ADDRESSID=C.ADDRESSID
AND A.ELEMENTTYPE='ADD1'));

INSERT INTO STAGE.CLIENT_DATA
(SELECT D.ADDRESSID, D.ID,'ADD1',C1.RA_STNBR
FROM STAGE.CLIENT_ADDRESSDATA1 C, DB2INST3.ELEMENT D,STAGE.CLIENT C1
WHERE C.ADDRESSID=D.ADDRESSID AND D.ELEMENTTYPE='ADD1' AND C1.RID=C.RID);

--ADD2

INSERT INTO STAGE.CLIENT_DATA
(SELECT C.ADDRESSID, NEXTVAL FOR STAGE.ID_SEQ,'ADD2', C1.RA_DIR
FROM STAGE.CLIENT C1,STAGE.CLIENT_ADDRESSDATA1 C
WHERE
C1.RID=C.RID AND NOT EXISTS (SELECT * FROM DB2INST3.ELEMENT A WHERE A.ADDRESSID=C.ADDRESSID
AND A.ELEMENTTYPE='ADD2'));

INSERT INTO STAGE.CLIENT_DATA
(SELECT D.ADDRESSID, D.ID,'ADD1',C1.RA_DIR
FROM STAGE.CLIENT_ADDRESSDATA1 C, DB2INST3.ELEMENT D,STAGE.CLIENT C1
WHERE C.ADDRESSID=D.ADDRESSID AND D.ELEMENTTYPE='ADD2' AND C1.RID=C.RID);

Similarly it will CITY, STATE, ZIP, COUNTY and more. After using LEFT outer join, two SQLs are combined but still there will be 6 SQLs [1 per address element].

Thanks
Diwakar
Reply With Quote
  #9 (permalink)  
Old 04-12-07, 15:41
stolze stolze is offline
Registered User
 
Join Date: Jan 2007
Location: Jena, Germany
Posts: 2,662
If I got this rigth, then each address element (ADD1, ADD2, CITY, STATE, ...) is stored in a separate row, correct? If so, then you could use a simple UNION operator (or better yet: UNION ALL to avoid sort operations for duplicate elimination):
Code:
INSERT INTO ...
( SELECT ..., 'ADD1', ... FROM ... )
UNION ALL
( SELECT ..., 'ADD2', ... FROM ... )
UNION ALL
( SELECT ..., 'CITY' ... FROM ... )
It is then the DB2 optimizer's job to come up with the best plan for that.

An alternative would be to build this step-by-step to avoid the UNION operatiors and, thus, do this in a single sweep over the underlying tables. The critical piece is that each entry in table STAGE.CLIENT_ADDRESSDATA1 has to fan out to 6 rows (one for each address element). Those 6 elements may or may not exist in the ELEMENTS table.
Code:
INSERT
INTO   stage.client_data
SELECT ca.addressid,
       COALESCE(e.id, NEXTVAL FOR stage.id_seq),
       types.name,
       c.ra_stnbr
FROM   stage.client AS c JOIN
       stage.client_addressdata1 AS ca ON ( c.rid = ca.rid ),
       -- cross join with static table of address elements to get the fan-out
       TABLE ( VALUES ( 'ADD1', 'ADD2', 'CITY', 'STATE', 'ZIP', 'COUNTY' ) ) AS types(name)
       -- now the outer join to add any existing ID information
       LEFT OUTER JOIN
       ( SELECT id, elementtype
         FROM   db2inst3.element ) AS e ON ( ca.addressid = e.addressid AND e.elementtype = types.name)
We introduced a cross join for the fan-out. After the cross join (but before the outer join), each row in CLIENT_ADDRESSDATA1 causes 6 rows to be produced for the final insert, i.e. one row for each address element. The last step is still the left outer join, which determines if there is an element with that type name and for that address already in the ELEMENT table.

I suggest that you try this statement by adding first one join, executing the query, then add the other join and verify the results again - because I'm not sure I got everything correct at the first shot.
__________________
Knut Stolze
IBM DB2 Analytics Accelerator
IBM Germany Research & Development
Reply With Quote
  #10 (permalink)  
Old 04-12-07, 16:06
diwakar123 diwakar123 is offline
Registered User
 
Join Date: Jan 2007
Posts: 33
Knut,

Thanks a lot for quick response ..Yes, each address element will have seperate row...

The column "c.ra_stnbr" in above query will populate the value for ADD1, but since there is sequence generating the values, how can we take care of all address element into one... If yes, how...

Secondly, how to add actual columns like "c.ra_stnbr" for other lement such as c.dir, c.ra_city etc...

thanks
Diwakar
Reply With Quote
  #11 (permalink)  
Old 04-12-07, 16:20
diwakar123 diwakar123 is offline
Registered User
 
Join Date: Jan 2007
Posts: 33
I mean ,there are multiple source column to populate the values for address element. In original SQL we are using:

ra_stnbr to get ADD1
ra_dir=ADD2 and for other element we will have
ra_city=CITY
ra_state=STATE
ra_zip=ZIP and ra_country=COUNTY

Thanks
Diwakar
Reply With Quote
  #12 (permalink)  
Old 04-12-07, 18:22
stolze stolze is offline
Registered User
 
Join Date: Jan 2007
Location: Jena, Germany
Posts: 2,662
Sorry, I missed that part. It's just a trivial issue that can be solved with a CASE expression like this:
Code:
CASE types.name
   WHEN 'ADD1' THEN ra_stnbr
   WHEN 'ADD2' THEN ra_dir
   WHEN 'CITY' THEN ra_city
   WHEN 'STATE' THEN ra_state
   WHEN 'ZIP' THEN ra_zip
   WHEN 'COUNTY' THEN ra_county
END
__________________
Knut Stolze
IBM DB2 Analytics Accelerator
IBM Germany Research & Development
Reply With Quote
  #13 (permalink)  
Old 04-12-07, 21:28
diwakar123 diwakar123 is offline
Registered User
 
Join Date: Jan 2007
Posts: 33
I have updated with all the chanes, but I am getting:

SQL0338N An ON clause associated with a JOIN operator or in a MERGE statement is not valid.

error..

I hv verified syntax but could not figured it out.

Thanks
Diwakar
Reply With Quote
  #14 (permalink)  
Old 04-12-07, 21:46
diwakar123 diwakar123 is offline
Registered User
 
Join Date: Jan 2007
Posts: 33
and when use:
TABLE ( VALUES ( 'ADD1','ADD2'........) ) AS types(name) LEFT OUTER JOIN, it gives error:
SQL0158N The number of columns specified for "TYPES " is not the same as the number of columns in the result table.

but if used only 'ADD1' only it works.
Reply With Quote
  #15 (permalink)  
Old 04-14-07, 09:02
stolze stolze is offline
Registered User
 
Join Date: Jan 2007
Location: Jena, Germany
Posts: 2,662
Try this instead:
Code:
TABLE ( VALUES ('ADD1'), ('ADD2'), ('CITY'), ...) ) AS types(name)
This will produce a table with 6 rows and 1 column instead of a table with 6 columns and 1 rows.

As for the SQL0338, I guess that the problem is that the last ON clause accesses column CA.ADDRESSID and that is over a cross join (instead of an INNER join). Move this predicate to the WHERE clause, i.e. change the last ON clause to:
Code:
ON ( e.elementtype = types.name)
and add this at the very end:
Code:
WHERE ca.addressid = e.addressid
p.s: As long as you agree with my argumentation, I believe that we only have to deal with the syntax issues. I didn't try/verify the statements myself and just typed them...
__________________
Knut Stolze
IBM DB2 Analytics Accelerator
IBM Germany Research & Development
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On