Results 1 to 2 of 2
  1. #1
    Join Date
    Jan 2003
    Location
    Zutphen,Netherlands
    Posts
    256

    Question Unanswered: Resetting outer join to normal join lowers performance

    I am busy tweaking a datamart that contains a grouping table (dimension table) which is joined to a facttable through an outer join. All data from facttable is returned even if there is no entry in the dimensiontable. This causes questions in reporting, so we added all missing entries in the dimensiontable and would like to change the join-type to an inner join for performance-sake.

    Creating the join as an inner join however resulted in 6 times slower performance overall and a completely different access path for the query (as explain plan shows)

    1. OUTER JOIN QUERY:

    SELECT
    BIKE_SALES_ORDERS_AGGR.FISC_YEAR,
    BIKE_SALES_ORDERS_AGGR.FISC_MONTH,
    BIKE_SALES_ORDERS_AGGR.FISC_PERIOD,
    BIKE_SALES_ORDERS_AGGR.CCOUN,
    sum(BIKE_SALES_ORDERS_AGGR.QTY_SALES),
    sum(BIKE_SALES_ORDERS_AGGR.VAL_SALES),
    DIM_SALESMEN_COMP.SALES_REP_ID,
    DIM_SALESMEN_COMP.SALES_REP_NAME,
    DIM_SALESMEN_COMP.SALES_REP_NAME,
    BIKE_IREF01_TRANSLATION.IREF01,
    BIKE_IREF01_TRANSLATION.DESCRIPTION,
    BIKE_IREF01_TRANSLATION.IABBT,
    BIKE_IREF01_TRANSLATION.CATEGORY,
    sum(BIKE_SALES_ORDERS_AGGR.VAL_ORDER),
    TIME_DIST_PERIOD.FISC_PERIOD_REL,
    TIME_DIST_PERIOD.MONTHNAME,
    BIKE_SALES_ORDERS_AGGR.CUSTOMER
    FROM
    DIM_SALESMEN_COMP,
    BIKE_SALES_ORDERS_AGGR LEFT OUTER JOIN BIKE_IREF01_TRANSLATION ON BIKE_IREF01_TRANSLATION.IREF01=BIKE_SALES_ORDERS_A GGR.REF01,
    TIME_DIST_PERIOD
    WHERE
    ( BIKE_SALES_ORDERS_AGGR.SALES_REP=DIM_SALESMEN_COMP .SALES_REP_ID )
    AND ( BIKE_SALES_ORDERS_AGGR.FISC_PERIOD=TIME_DIST_PERIO D.FISC_PERIOD )
    AND (
    BIKE_SALES_ORDERS_AGGR.FISC_PERIOD IN (200202,200203,200204,200205,200206,200207,200208, 200209,200210,200211,200212,200301,200302,200303)
    AND BIKE_SALES_ORDERS_AGGR.CUSTOMER <> 333333
    AND BIKE_SALES_ORDERS_AGGR.CUSTOMER BETWEEN 200000 AND 500000
    AND DIM_SALESMEN_COMP.SALES_REP_ID IN (1,2,3,4,5,4001,4002,4003,4004,4005)
    )
    GROUP BY
    BIKE_SALES_ORDERS_AGGR.FISC_YEAR,
    BIKE_SALES_ORDERS_AGGR.FISC_MONTH,
    BIKE_SALES_ORDERS_AGGR.FISC_PERIOD,
    BIKE_SALES_ORDERS_AGGR.CCOUN,
    DIM_SALESMEN_COMP.SALES_REP_ID,
    DIM_SALESMEN_COMP.SALES_REP_NAME,
    DIM_SALESMEN_COMP.SALES_REP_NAME,
    BIKE_IREF01_TRANSLATION.IREF01,
    BIKE_IREF01_TRANSLATION.DESCRIPTION,
    BIKE_IREF01_TRANSLATION.IABBT,
    BIKE_IREF01_TRANSLATION.CATEGORY,
    TIME_DIST_PERIOD.FISC_PERIOD_REL,
    TIME_DIST_PERIOD.MONTHNAME,
    BIKE_SALES_ORDERS_AGGR.CUSTOMER

    2. INNER JOIN QUERY:

    SELECT
    BIKE_SALES_ORDERS_AGGR.FISC_YEAR,
    BIKE_SALES_ORDERS_AGGR.FISC_MONTH,
    BIKE_SALES_ORDERS_AGGR.FISC_PERIOD,
    BIKE_SALES_ORDERS_AGGR.CCOUN,
    sum(BIKE_SALES_ORDERS_AGGR.QTY_SALES),
    sum(BIKE_SALES_ORDERS_AGGR.VAL_SALES),
    DIM_SALESMEN_COMP.SALES_REP_ID,
    DIM_SALESMEN_COMP.SALES_REP_NAME,
    DIM_SALESMEN_COMP.SALES_REP_NAME,
    BIKE_IREF01_TRANSLATION.IREF01,
    BIKE_IREF01_TRANSLATION.DESCRIPTION,
    BIKE_IREF01_TRANSLATION.IABBT,
    BIKE_IREF01_TRANSLATION.CATEGORY,
    sum(BIKE_SALES_ORDERS_AGGR.VAL_ORDER),
    TIME_DIST_PERIOD.FISC_PERIOD_REL,
    TIME_DIST_PERIOD.MONTHNAME,
    BIKE_SALES_ORDERS_AGGR.CUSTOMER
    FROM
    BIKE_SALES_ORDERS_AGGR,
    DIM_SALESMEN_COMP,
    BIKE_IREF01_TRANSLATION,
    TIME_DIST_PERIOD
    WHERE
    ( BIKE_SALES_ORDERS_AGGR.SALES_REP=DIM_SALESMEN_COMP .SALES_REP_ID )
    AND ( BIKE_IREF01_TRANSLATION.IREF01=BIKE_SALES_ORDERS_A GGR.REF01 )
    AND ( BIKE_SALES_ORDERS_AGGR.FISC_PERIOD=TIME_DIST_PERIO D.FISC_PERIOD )
    AND (
    BIKE_SALES_ORDERS_AGGR.FISC_PERIOD IN (200202,200203,200204,200205,200206,200207,200208, 200209,200210,200211,200212,200301,200302,200303)
    AND BIKE_SALES_ORDERS_AGGR.CUSTOMER <> 333333
    AND BIKE_SALES_ORDERS_AGGR.CUSTOMER BETWEEN 200000 AND 500000
    AND DIM_SALESMEN_COMP.SALES_REP_ID IN (1,2,3,4,5,4001,4002,4003,4004,4005)
    )
    GROUP BY
    BIKE_SALES_ORDERS_AGGR.FISC_YEAR,
    BIKE_SALES_ORDERS_AGGR.FISC_MONTH,
    BIKE_SALES_ORDERS_AGGR.FISC_PERIOD,
    BIKE_SALES_ORDERS_AGGR.CCOUN,
    DIM_SALESMEN_COMP.SALES_REP_ID,
    DIM_SALESMEN_COMP.SALES_REP_NAME,
    DIM_SALESMEN_COMP.SALES_REP_NAME,
    BIKE_IREF01_TRANSLATION.IREF01,
    BIKE_IREF01_TRANSLATION.DESCRIPTION,
    BIKE_IREF01_TRANSLATION.IABBT,
    BIKE_IREF01_TRANSLATION.CATEGORY,
    TIME_DIST_PERIOD.FISC_PERIOD_REL,
    TIME_DIST_PERIOD.MONTHNAME,
    BIKE_SALES_ORDERS_AGGR.CUSTOMER


    where the facttable is BIKE_SALES_ORDERS_AGGR
    dimension tables are DIM_SALESMEN_COMP, BIKE_IREF01_TRANSLATION
    and TIME_DIST_PERIOD is a view on a calendertable

    The inner join query starts with a full table scan on the facttable, whereas the outer query uses both indexes on DIM_SALESMEN_COMP and on BIKE_SALES_ORDERS_AGGR

    This result is quite the opposite from what I expected, cause usually outer joins are detrimental for using indexes Anyone a clue.......?
    Ties Blom
    Senior Application Developer BI
    Getronics Healthcare
    DB2,ORACLE,Powercenter,BusObj,Access,
    SQL, SQL server

  2. #2
    Join Date
    Oct 2001
    Location
    Bangalore
    Posts
    186
    Hi,

    I guess therotically that is true and it also depends on the volume and type of operations the database has to perform. But in a Warehouse of your design the inner join goes for pairing of each row with fact table whcih is more costly.

    Regarding use of indexes can you list the indexes as well in use.

    Cheers,
    Prashant

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •