Results 1 to 2 of 2
  1. #1
    Join Date
    Nov 2007
    Posts
    27

    Unanswered: query accessing federated tables

    hi guys
    i have a problem in tuning a query.the following query is taking
    long time.this query uses federated database tables(nick names).
    this query is running against a data warehouse. and it is DB2V8.7(SOLARIS)
    it has all the indexes it needed and also the runstats were updated.
    i am not sure sure whether or not the optimizer is using the indexes present on the
    tables of the federated database. i am posting the explain output also.
    here is the query.and also plz let me know what all things i need to look into.


    with temp1 as (
    select FISCAL_PERIOD_YEAR_NB, FISCAL_PERIOD_Week_NB,
    calendar_week_nb, PK_calendar_dt AS calendar_dt
    from PS.CALENDAR_DATE_FISCAL_YEAR
    where FISCAL_PERIOD_YEAR_NB =2007 and FISCAL_PERIOD_Week_NB =46),
    BACK_90_DAYS_TEMP AS (
    select FISCAL_PERIOD_YEAR_NB, FISCAL_PERIOD_Week_NB,
    calendar_week_nb, calendar_dt, calendar_dt - 90 days as
    Back_90_days_date
    from temp1), VENDOR_ENTERED_PO_CREATED_3_MONTHS_AGO AS (
    SELECT DISTINCT ORDREF.REFERENCE_NUMBER_TX
    FROM dw.ORDER O INNER JOIN dw.TOUR_ORDER TORD ON
    O.ORDER_ID =TORD.ORDER_ID AND O.STATUS_ID NOT IN (2029,
    2019, 1230, 1330, 1226, 1858)INNER JOIN dw.TOUR_FACT
    TF ON TORD.TOUR_ID =TF.TOUR_ID AND TF.PROJECT_ID =50 AND
    TF.TOUR_STATUS_ID NOT IN (1543, 1330, 2379, 4725, 1323,
    2070)INNER JOIN dw.ORDER_REFERENCE ORDREF ON
    O.ORDER_ID =ORDREF.ORDER_ID AND ORDREF.TYPE_ID =2346 AND
    ORDREF.STATUS_ID =1249
    WHERE O.CREATE_USER_CD LIKE '250%'
    ORDER BY ORDREF.REFERENCE_NUMBER_TX)
    select *
    from VENDOR_ENTERED_PO_CREATED_3_MONTHS_AGO





    EXPLAIN OUTPUT:

    ==================== STATEMENT ==========================================

    Isolation Level = Cursor Stability
    Blocking = Block Unambiguous Cursors
    Query Optimization Class = 7

    Partition Parallel = No
    Intra-Partition Parallel = No

    SQL Path = "SYSIBM", "SYSFUN", "SYSPROC", "PS"


    SQL Statement:

    with temp1 as (
    select FISCAL_PERIOD_YEAR_NB, FISCAL_PERIOD_Week_NB,
    calendar_week_nb, PK_calendar_dt AS calendar_dt
    from PS.CALENDAR_DATE_FISCAL_YEAR
    where FISCAL_PERIOD_YEAR_NB =2007 and FISCAL_PERIOD_Week_NB =46),
    BACK_90_DAYS_TEMP AS (
    select FISCAL_PERIOD_YEAR_NB, FISCAL_PERIOD_Week_NB,
    calendar_week_nb, calendar_dt, calendar_dt - 90 days as
    Back_90_days_date
    from temp1), VENDOR_ENTERED_PO_CREATED_3_MONTHS_AGO AS (
    SELECT DISTINCT ORDREF.REFERENCE_NUMBER_TX
    FROM dw.ORDER O INNER JOIN dw.TOUR_ORDER TORD ON
    O.ORDER_ID =TORD.ORDER_ID AND O.STATUS_ID NOT IN (2029,
    2019, 1230, 1330, 1226, 1858)INNER JOIN dw.TOUR_FACT
    TF ON TORD.TOUR_ID =TF.TOUR_ID AND TF.PROJECT_ID =50 AND
    TF.TOUR_STATUS_ID NOT IN (1543, 1330, 2379, 4725, 1323,
    2070)INNER JOIN dw.ORDER_REFERENCE ORDREF ON
    O.ORDER_ID =ORDREF.ORDER_ID AND ORDREF.TYPE_ID =2346 AND
    ORDREF.STATUS_ID =1249
    WHERE O.CREATE_USER_CD LIKE '250%'
    ORDER BY ORDREF.REFERENCE_NUMBER_TX)
    select *
    from VENDOR_ENTERED_PO_CREATED_3_MONTHS_AGO


    Section Code Page = 819

    Estimated Cost = 880334.625000
    Estimated Cardinality = 30.802761

    Ship Distributed Subquery #1
    | #Columns = 1
    Return Data to Application
    | #Columns = 1

    Distributed Substatement #1:
    Server: dw (DB2/UDB 8.2)
    SQL Statement:

    SELECT DISTINCT A1."REFERENCE_NUMBER_TX"
    FROM "dw"."TOUR_ORDER"A0, "dw"."ORDER_REFERENCE"A1, "FACT"."
    TOUR_FACT"A2, "dw"."ORDER"A3
    WHERE (A1."STATUS_ID"=1249)AND (A1."TYPE_ID"=2346)AND (A0."ORDER_ID"
    =A1."ORDER_ID")AND (NOT((A2."TOUR_STATUS_ID"IN (1543, 1330,
    2379, 4725, 1323, 2070))))AND (A2."PROJECT_ID"=50)AND (A0."
    TOUR_ID"=A2."TOUR_ID")AND (A3."CREATE_USER_CD"LIKE '250%')
    AND (A1."ORDER_ID"=A3."ORDER_ID")AND (NOT((A3."STATUS_ID"IN
    (2029, 2019, 1230, 1330, 1226, 1858))))
    ORDER BY 1 ASC
    FOR READ ONLY

    Nicknames Referenced:
    dw.ORDER ID = 32818
    Base = dw.ORDER
    dw.TOUR_FACT ID = 32828
    Base = FACT.TOUR_FACT
    dw.ORDER_REFERENCE ID = 32815
    Base = dw.ORDER_REFERENCE
    dw.TOUR_ORDER ID = 32811
    Base = dw.TOUR_ORDER

    #Output Columns = 1

    End of section


    Optimizer Plan:

    RETURN
    ( 1)
    |
    SHIP
    ( 2)
    +----------+---------+---+--------------+
    Nickname: Nickname: Nickname: Nickname:
    dw dw dw dw
    ORDER TOUR_FACT ORDER_REFERENCE TOUR_ORDER



    Thanks.

  2. #2
    Join Date
    Jan 2007
    Location
    Jena, Germany
    Posts
    2,721
    Quote Originally Posted by rajaryan4545
    hi guys
    i have a problem in tuning a query.the following query is taking
    long time.this query uses federated database tables(nick names).
    this query is running against a data warehouse. and it is DB2V8.7(SOLARIS)
    it has all the indexes it needed and also the runstats were updated.
    i am not sure sure whether or not the optimizer is using the indexes present on the
    tables of the federated database. i am posting the explain output also.
    here is the query.and also plz let me know what all things i need to look into.
    With a bit of reformatting, I think it's easier to understand your query:
    Code:
    WITH
       temp1 AS (
          SELECT fiscal_period_year_nb, fiscal_period_week_nb,
                 calendar_week_nb, pk_calendar_dt AS calendar_dt
          FROM   ps.calendar_date_fiscal_year
          WHERE  fiscal_period_year_nb = 2007 AND
                 fiscal_period_week_nb = 46),
       back_90_days_temp AS (
          SELECT fiscal_period_year_nb, fiscal_period_week_nb,
                 calendar_week_nb, calendar_dt, calendar_dt - 90 DAYS AS back_90_days_date
          FROM   temp1),
       vendor_entered_po_created_3_months_ago AS (
          SELECT DISTINCT ordref.reference_number_tx
          FROM   dw.order AS o
                    INNER JOIN
                 dw.tour_order AS tord ON ( o.order_id = tord.order_id AND
                                            o.status_id NOT IN ( 2029, 2019, 1230, 1330, 1226, 1858 ) )
                    INNER JOIN
                 dw.tour_fact AS tf ON ( tord.tour_id =tf.tour_id AND
                                         tf.project_id = 50 AND
                                         tf.tour_status_id NOT IN ( 1543, 1330, 2379, 4725, 1323, 2070 ) )
                    INNER JOIN
                 dw.order_reference AS ordref ON ( o.order_id = ordref.order_id AND
                                                   ordref.type_id = 2346 AND
                                                   ordref.status_id = 1249 )
          WHERE  o.create_user_cd LIKE '250%'
          ORDER BY ordref.reference_number_tx )
    SELECT *
    FROM   vendor_entered_po_created_3_months_ago
    There are a few things to note here:
    • You don't use the CTE for BACK_90_DAYS_TEMP at all. And TEMP1 is only required for BACK_90_DAYS_TEMP (so TEMP1 isn't used either). Is there a particular reason why you added it to the query?
    • The ORDER BY clause in the 3rd CTE doesn't make any sense. A CTE produces a (temp) table and a table is just a set of rows. (And per definition, a set is not sorted in any way.) In the explain output below, you can see that the ORDER BY is actually shipped to the remote data source. That is a good performance killer because the data sources has to materialize the whole result first before passing it back to the federated server. Removing the ORDER BY would be a good idea, unless you really need it (in which case it belongs to the outer-most SELECT.
    • Do you really need the DISTINCT? The same things as for the ORDER BY apply here. The DISTINCT keyword will cause the materialization of the result set to happen at the remote data source. If you can get rid of the DISTINCT and the ORDER BY, you may get a piped/streamed access plan.
    • Your join predicates contain things like "o.status_id NOT IN ( 2029, 2019, 1230, 1330, 1226, 1858 )" or "ordref.type_id = 2346". Those are not join predicates but rather filters, and I would put such filters into the WHERE clause. This wouldn't make any difference performance-wise, however. It is just clearer and easier to understand.


    Besides that, you can see from the explain output that the whole query is directly processed by the remote data source. No work is done at the federated server, except passing the results through.

    Ship Distributed Subquery #1
    | #Columns = 1
    Return Data to Application
    | #Columns = 1

    Distributed Substatement #1:
    Server: dw (DB2/UDB 8.2)
    SQL Statement:

    Code:
         SELECT DISTINCT A1."REFERENCE_NUMBER_TX"
         FROM "dw"."TOUR_ORDER"A0, "dw"."ORDER_REFERENCE"A1, "FACT"."
                 TOUR_FACT"A2, "dw"."ORDER"A3
         WHERE (A1."STATUS_ID"=1249)AND (A1."TYPE_ID"=2346)AND (A0."ORDER_ID"
                 =A1."ORDER_ID")AND (NOT((A2."TOUR_STATUS_ID"IN (1543, 1330,
                 2379, 4725, 1323, 2070))))AND (A2."PROJECT_ID"=50)AND (A0."
                 TOUR_ID"=A2."TOUR_ID")AND (A3."CREATE_USER_CD"LIKE '250%')
                 AND (A1."ORDER_ID"=A3."ORDER_ID")AND (NOT((A3."STATUS_ID"IN
                 (2029, 2019, 1230, 1330, 1226, 1858))))
         ORDER BY 1 ASC
         FOR READ ONLY
    
    
    Optimizer Plan:
    
                           RETURN
                           (   1)
                             |
                            SHIP
                           (   2)
        +----------+---------+---+--------------+
     Nickname:  Nickname:  Nickname:        Nickname:
     dw      dw      dw            dw
     ORDER      TOUR_FACT  ORDER_REFERENCE  TOUR_ORDER
    Knut Stolze
    IBM DB2 Analytics Accelerator
    IBM Germany Research & Development

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •