If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > DB2 > query accessing federated tables

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 01-02-08, 17:58
rajaryan4545 rajaryan4545 is offline
Registered User
 
Join Date: Nov 2007
Posts: 27
query accessing federated tables

hi guys
i have a problem in tuning a query.the following query is taking
long time.this query uses federated database tables(nick names).
this query is running against a data warehouse. and it is DB2V8.7(SOLARIS)
it has all the indexes it needed and also the runstats were updated.
i am not sure sure whether or not the optimizer is using the indexes present on the
tables of the federated database. i am posting the explain output also.
here is the query.and also plz let me know what all things i need to look into.


with temp1 as (
select FISCAL_PERIOD_YEAR_NB, FISCAL_PERIOD_Week_NB,
calendar_week_nb, PK_calendar_dt AS calendar_dt
from PS.CALENDAR_DATE_FISCAL_YEAR
where FISCAL_PERIOD_YEAR_NB =2007 and FISCAL_PERIOD_Week_NB =46),
BACK_90_DAYS_TEMP AS (
select FISCAL_PERIOD_YEAR_NB, FISCAL_PERIOD_Week_NB,
calendar_week_nb, calendar_dt, calendar_dt - 90 days as
Back_90_days_date
from temp1), VENDOR_ENTERED_PO_CREATED_3_MONTHS_AGO AS (
SELECT DISTINCT ORDREF.REFERENCE_NUMBER_TX
FROM dw.ORDER O INNER JOIN dw.TOUR_ORDER TORD ON
O.ORDER_ID =TORD.ORDER_ID AND O.STATUS_ID NOT IN (2029,
2019, 1230, 1330, 1226, 1858)INNER JOIN dw.TOUR_FACT
TF ON TORD.TOUR_ID =TF.TOUR_ID AND TF.PROJECT_ID =50 AND
TF.TOUR_STATUS_ID NOT IN (1543, 1330, 2379, 4725, 1323,
2070)INNER JOIN dw.ORDER_REFERENCE ORDREF ON
O.ORDER_ID =ORDREF.ORDER_ID AND ORDREF.TYPE_ID =2346 AND
ORDREF.STATUS_ID =1249
WHERE O.CREATE_USER_CD LIKE '250%'
ORDER BY ORDREF.REFERENCE_NUMBER_TX)
select *
from VENDOR_ENTERED_PO_CREATED_3_MONTHS_AGO





EXPLAIN OUTPUT:

==================== STATEMENT ==========================================

Isolation Level = Cursor Stability
Blocking = Block Unambiguous Cursors
Query Optimization Class = 7

Partition Parallel = No
Intra-Partition Parallel = No

SQL Path = "SYSIBM", "SYSFUN", "SYSPROC", "PS"


SQL Statement:

with temp1 as (
select FISCAL_PERIOD_YEAR_NB, FISCAL_PERIOD_Week_NB,
calendar_week_nb, PK_calendar_dt AS calendar_dt
from PS.CALENDAR_DATE_FISCAL_YEAR
where FISCAL_PERIOD_YEAR_NB =2007 and FISCAL_PERIOD_Week_NB =46),
BACK_90_DAYS_TEMP AS (
select FISCAL_PERIOD_YEAR_NB, FISCAL_PERIOD_Week_NB,
calendar_week_nb, calendar_dt, calendar_dt - 90 days as
Back_90_days_date
from temp1), VENDOR_ENTERED_PO_CREATED_3_MONTHS_AGO AS (
SELECT DISTINCT ORDREF.REFERENCE_NUMBER_TX
FROM dw.ORDER O INNER JOIN dw.TOUR_ORDER TORD ON
O.ORDER_ID =TORD.ORDER_ID AND O.STATUS_ID NOT IN (2029,
2019, 1230, 1330, 1226, 1858)INNER JOIN dw.TOUR_FACT
TF ON TORD.TOUR_ID =TF.TOUR_ID AND TF.PROJECT_ID =50 AND
TF.TOUR_STATUS_ID NOT IN (1543, 1330, 2379, 4725, 1323,
2070)INNER JOIN dw.ORDER_REFERENCE ORDREF ON
O.ORDER_ID =ORDREF.ORDER_ID AND ORDREF.TYPE_ID =2346 AND
ORDREF.STATUS_ID =1249
WHERE O.CREATE_USER_CD LIKE '250%'
ORDER BY ORDREF.REFERENCE_NUMBER_TX)
select *
from VENDOR_ENTERED_PO_CREATED_3_MONTHS_AGO


Section Code Page = 819

Estimated Cost = 880334.625000
Estimated Cardinality = 30.802761

Ship Distributed Subquery #1
| #Columns = 1
Return Data to Application
| #Columns = 1

Distributed Substatement #1:
Server: dw (DB2/UDB 8.2)
SQL Statement:

SELECT DISTINCT A1."REFERENCE_NUMBER_TX"
FROM "dw"."TOUR_ORDER"A0, "dw"."ORDER_REFERENCE"A1, "FACT"."
TOUR_FACT"A2, "dw"."ORDER"A3
WHERE (A1."STATUS_ID"=1249)AND (A1."TYPE_ID"=2346)AND (A0."ORDER_ID"
=A1."ORDER_ID")AND (NOT((A2."TOUR_STATUS_ID"IN (1543, 1330,
2379, 4725, 1323, 2070))))AND (A2."PROJECT_ID"=50)AND (A0."
TOUR_ID"=A2."TOUR_ID")AND (A3."CREATE_USER_CD"LIKE '250%')
AND (A1."ORDER_ID"=A3."ORDER_ID")AND (NOT((A3."STATUS_ID"IN
(2029, 2019, 1230, 1330, 1226, 1858))))
ORDER BY 1 ASC
FOR READ ONLY

Nicknames Referenced:
dw.ORDER ID = 32818
Base = dw.ORDER
dw.TOUR_FACT ID = 32828
Base = FACT.TOUR_FACT
dw.ORDER_REFERENCE ID = 32815
Base = dw.ORDER_REFERENCE
dw.TOUR_ORDER ID = 32811
Base = dw.TOUR_ORDER

#Output Columns = 1

End of section


Optimizer Plan:

RETURN
( 1)
|
SHIP
( 2)
+----------+---------+---+--------------+
Nickname: Nickname: Nickname: Nickname:
dw dw dw dw
ORDER TOUR_FACT ORDER_REFERENCE TOUR_ORDER



Thanks.
Reply With Quote
  #2 (permalink)  
Old 01-03-08, 06:01
stolze stolze is offline
Registered User
 
Join Date: Jan 2007
Location: Jena, Germany
Posts: 2,662
Quote:
Originally Posted by rajaryan4545
hi guys
i have a problem in tuning a query.the following query is taking
long time.this query uses federated database tables(nick names).
this query is running against a data warehouse. and it is DB2V8.7(SOLARIS)
it has all the indexes it needed and also the runstats were updated.
i am not sure sure whether or not the optimizer is using the indexes present on the
tables of the federated database. i am posting the explain output also.
here is the query.and also plz let me know what all things i need to look into.
With a bit of reformatting, I think it's easier to understand your query:
Code:
WITH
   temp1 AS (
      SELECT fiscal_period_year_nb, fiscal_period_week_nb,
             calendar_week_nb, pk_calendar_dt AS calendar_dt
      FROM   ps.calendar_date_fiscal_year
      WHERE  fiscal_period_year_nb = 2007 AND
             fiscal_period_week_nb = 46),
   back_90_days_temp AS (
      SELECT fiscal_period_year_nb, fiscal_period_week_nb,
             calendar_week_nb, calendar_dt, calendar_dt - 90 DAYS AS back_90_days_date
      FROM   temp1),
   vendor_entered_po_created_3_months_ago AS (
      SELECT DISTINCT ordref.reference_number_tx
      FROM   dw.order AS o
                INNER JOIN
             dw.tour_order AS tord ON ( o.order_id = tord.order_id AND
                                        o.status_id NOT IN ( 2029, 2019, 1230, 1330, 1226, 1858 ) )
                INNER JOIN
             dw.tour_fact AS tf ON ( tord.tour_id =tf.tour_id AND
                                     tf.project_id = 50 AND
                                     tf.tour_status_id NOT IN ( 1543, 1330, 2379, 4725, 1323, 2070 ) )
                INNER JOIN
             dw.order_reference AS ordref ON ( o.order_id = ordref.order_id AND
                                               ordref.type_id = 2346 AND
                                               ordref.status_id = 1249 )
      WHERE  o.create_user_cd LIKE '250%'
      ORDER BY ordref.reference_number_tx )
SELECT *
FROM   vendor_entered_po_created_3_months_ago
There are a few things to note here:
  • You don't use the CTE for BACK_90_DAYS_TEMP at all. And TEMP1 is only required for BACK_90_DAYS_TEMP (so TEMP1 isn't used either). Is there a particular reason why you added it to the query?
  • The ORDER BY clause in the 3rd CTE doesn't make any sense. A CTE produces a (temp) table and a table is just a set of rows. (And per definition, a set is not sorted in any way.) In the explain output below, you can see that the ORDER BY is actually shipped to the remote data source. That is a good performance killer because the data sources has to materialize the whole result first before passing it back to the federated server. Removing the ORDER BY would be a good idea, unless you really need it (in which case it belongs to the outer-most SELECT.
  • Do you really need the DISTINCT? The same things as for the ORDER BY apply here. The DISTINCT keyword will cause the materialization of the result set to happen at the remote data source. If you can get rid of the DISTINCT and the ORDER BY, you may get a piped/streamed access plan.
  • Your join predicates contain things like "o.status_id NOT IN ( 2029, 2019, 1230, 1330, 1226, 1858 )" or "ordref.type_id = 2346". Those are not join predicates but rather filters, and I would put such filters into the WHERE clause. This wouldn't make any difference performance-wise, however. It is just clearer and easier to understand.

Besides that, you can see from the explain output that the whole query is directly processed by the remote data source. No work is done at the federated server, except passing the results through.
Quote:

Ship Distributed Subquery #1
| #Columns = 1
Return Data to Application
| #Columns = 1

Distributed Substatement #1:
Server: dw (DB2/UDB 8.2)
SQL Statement:

Code:
     SELECT DISTINCT A1."REFERENCE_NUMBER_TX"
     FROM "dw"."TOUR_ORDER"A0, "dw"."ORDER_REFERENCE"A1, "FACT"."
             TOUR_FACT"A2, "dw"."ORDER"A3
     WHERE (A1."STATUS_ID"=1249)AND (A1."TYPE_ID"=2346)AND (A0."ORDER_ID"
             =A1."ORDER_ID")AND (NOT((A2."TOUR_STATUS_ID"IN (1543, 1330,
             2379, 4725, 1323, 2070))))AND (A2."PROJECT_ID"=50)AND (A0."
             TOUR_ID"=A2."TOUR_ID")AND (A3."CREATE_USER_CD"LIKE '250%')
             AND (A1."ORDER_ID"=A3."ORDER_ID")AND (NOT((A3."STATUS_ID"IN
             (2029, 2019, 1230, 1330, 1226, 1858))))
     ORDER BY 1 ASC
     FOR READ ONLY


Optimizer Plan:

                       RETURN
                       (   1)
                         |
                        SHIP
                       (   2)
    +----------+---------+---+--------------+
 Nickname:  Nickname:  Nickname:        Nickname:
 dw      dw      dw            dw
 ORDER      TOUR_FACT  ORDER_REFERENCE  TOUR_ORDER
__________________
Knut Stolze
IBM DB2 Analytics Accelerator
IBM Germany Research & Development
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On