Results 1 to 8 of 8
  1. #1
    Join Date
    Feb 2004
    Posts
    21

    Unanswered: Huge difference in performance, from one condition change?

    Gurus,

    Take a look at the queries below. The first 1, takes at least 10 minutes, while the 2nd, in 10 seconds.

    Code:
    SELECT
      LU_STOCK.STOCK_ID,
      LU_STOCK_INFO.STOCK_SNAME,
      LU_STOCK_INFO.STOCK_LNAME,
      LU_STOCK.COMPANY_IND,
      LU_TRADING_DAY.DAY_ID,
      LU_TRADING_DAY.DATESTART,
      MAX(FT_STOCK_TXN.CLOSE_PRICE),
      SUM(FT_STOCK_TXN.VOLUME),
      SUM(FT_STOCK_TXN.VALUE),
      SUM(CASE WHEN FT_STOCK_TXN.OFF_MKT_VOL IS NULL THEN 0 ELSE FT_STOCK_TXN.OFF_MKT_VOL END),
      SUM(CASE WHEN FT_STOCK_TXN.OFF_MKT_VAL IS NULL THEN 0 ELSE FT_STOCK_TXN.OFF_MKT_VAL END),
      LU_STATUS.STATUS_ID
    FROM
      LU_MONTH  LU_TRADING_MTH INNER JOIN LU_DAY  LU_TRADING_DAY ON (LU_TRADING_DAY.MONTH_ID=LU_TRADING_MTH.MONTH_ID)
       INNER JOIN FT_STOCK_TXN ON (LU_TRADING_DAY.DAY_ID=FT_STOCK_TXN.TRADE_DATE)
       INNER JOIN LU_STOCK ON (LU_STOCK.STOCK_ID=FT_STOCK_TXN.STOCK_ID)
       INNER JOIN LU_COMPANY ON (LU_STOCK.PARENT_ID=LU_COMPANY.STOCK_ID)
       INNER JOIN LU_DAY  LU_SNAPENDDAY ON (LU_COMPANY.END_DATE>=LU_SNAPENDDAY.DAY_ID)
       INNER JOIN LU_DAY  LU_SNAPSTARTDAY ON (LU_COMPANY.START_DATE<=LU_SNAPSTARTDAY.DAY_ID)
       INNER JOIN LU_STATUS ON (LU_STATUS.STATUS_ID=LU_STOCK.STATUS_ID)
       INNER JOIN LU_STOCK  LU_STOCK_INFO ON (LU_STOCK_INFO.STOCK_ID=LU_STOCK.STOCK_ID)
       INNER JOIN LU_DAY  LU_SNAPENDDAY_STK_INFO ON (LU_STOCK_INFO.END_DATE>=LU_SNAPENDDAY_STK_INFO.DAY_ID)
       INNER JOIN LU_DAY  LU_SNAPSTARTDAY_STK_INFO ON (LU_STOCK_INFO.START_DATE<=LU_SNAPSTARTDAY_STK_INFO.DAY_ID)
      
    WHERE
    ( LU_SNAPENDDAY.DAY_ID=LU_SNAPENDDAY_STK_INFO.DAY_ID AND LU_SNAPSTARTDAY.DAY_ID=LU_SNAPSTARTDAY_STK_INFO.DAY_ID  )
      AND  ( LU_TRADING_DAY.DAY_ID BETWEEN LU_STOCK.START_DATE AND LU_STOCK.END_DATE  )
      AND  
      (
       LU_TRADING_DAY.DAY_ID  BETWEEN  LU_SNAPSTARTDAY.DAY_ID  AND  LU_SNAPENDDAY.DAY_ID
       AND
       ( LU_TRADING_MTH.MONTH_ID  = 200401)
       AND
       LU_COMPANY.STOCK_ID  In  
         (
         SELECT
           distinct LU_COMPANY.STOCK_ID
         FROM
           LU_STOCK INNER JOIN LU_COMPANY ON (LU_STOCK.PARENT_ID=LU_COMPANY.STOCK_ID)
           
         WHERE
           ( ( RTRIM(LU_STOCK.STOCK_ID) ) = '5169')
         )
      )
    GROUP BY
      LU_STOCK.STOCK_ID, 
      LU_STOCK_INFO.STOCK_SNAME, 
      LU_STOCK_INFO.STOCK_LNAME, 
      LU_STOCK.COMPANY_IND, 
      LU_TRADING_DAY.DAY_ID, 
      LU_TRADING_DAY.DATESTART, 
      LU_STATUS.STATUS_ID
    Code:
    SELECT
      LU_STOCK.STOCK_ID,
      LU_STOCK_INFO.STOCK_SNAME,
      LU_STOCK_INFO.STOCK_LNAME,
      LU_STOCK.COMPANY_IND,
      LU_TRADING_DAY.DAY_ID,
      LU_TRADING_DAY.DATESTART,
      MAX(FT_STOCK_TXN.CLOSE_PRICE),
      SUM(FT_STOCK_TXN.VOLUME),
      SUM(FT_STOCK_TXN.VALUE),
      SUM(CASE WHEN FT_STOCK_TXN.OFF_MKT_VOL IS NULL THEN 0 ELSE FT_STOCK_TXN.OFF_MKT_VOL END),
      SUM(CASE WHEN FT_STOCK_TXN.OFF_MKT_VAL IS NULL THEN 0 ELSE FT_STOCK_TXN.OFF_MKT_VAL END),
      LU_STATUS.STATUS_ID
    FROM
      LU_MONTH  LU_TRADING_MTH INNER JOIN LU_DAY  LU_TRADING_DAY ON (LU_TRADING_DAY.MONTH_ID=LU_TRADING_MTH.MONTH_ID)
       INNER JOIN FT_STOCK_TXN ON (LU_TRADING_DAY.DAY_ID=FT_STOCK_TXN.TRADE_DATE)
       INNER JOIN LU_STOCK ON (LU_STOCK.STOCK_ID=FT_STOCK_TXN.STOCK_ID)
       INNER JOIN LU_COMPANY ON (LU_STOCK.PARENT_ID=LU_COMPANY.STOCK_ID)
       INNER JOIN LU_DAY  LU_SNAPENDDAY ON (LU_COMPANY.END_DATE>=LU_SNAPENDDAY.DAY_ID)
       INNER JOIN LU_DAY  LU_SNAPSTARTDAY ON (LU_COMPANY.START_DATE<=LU_SNAPSTARTDAY.DAY_ID)
       INNER JOIN LU_STATUS ON (LU_STATUS.STATUS_ID=LU_STOCK.STATUS_ID)
       INNER JOIN LU_STOCK  LU_STOCK_INFO ON (LU_STOCK_INFO.STOCK_ID=LU_STOCK.STOCK_ID)
       INNER JOIN LU_DAY  LU_SNAPENDDAY_STK_INFO ON (LU_STOCK_INFO.END_DATE>=LU_SNAPENDDAY_STK_INFO.DAY_ID)
       INNER JOIN LU_DAY  LU_SNAPSTARTDAY_STK_INFO ON (LU_STOCK_INFO.START_DATE<=LU_SNAPSTARTDAY_STK_INFO.DAY_ID)
      
    WHERE
    ( LU_SNAPENDDAY.DAY_ID=LU_SNAPENDDAY_STK_INFO.DAY_ID AND LU_SNAPSTARTDAY.DAY_ID=LU_SNAPSTARTDAY_STK_INFO.DAY_ID  )
      AND  ( LU_TRADING_DAY.DAY_ID BETWEEN LU_STOCK.START_DATE AND LU_STOCK.END_DATE  )
      AND  
      (
       LU_TRADING_DAY.DAY_ID  BETWEEN  LU_SNAPSTARTDAY.DAY_ID  AND  LU_SNAPENDDAY.DAY_ID
       AND
       ( LU_TRADING_MTH.MONTH_ID  > 200401-1)
       AND
       ( LU_TRADING_MTH.MONTH_ID < 200401+1)
       AND
       LU_COMPANY.STOCK_ID  In  
         (
         SELECT
           distinct LU_COMPANY.STOCK_ID
         FROM
           LU_STOCK INNER JOIN LU_COMPANY ON (LU_STOCK.PARENT_ID=LU_COMPANY.STOCK_ID)
           
         WHERE
           ( ( RTRIM(LU_STOCK.STOCK_ID) ) = '5169')
         )
      )
    GROUP BY
      LU_STOCK.STOCK_ID, 
      LU_STOCK_INFO.STOCK_SNAME, 
      LU_STOCK_INFO.STOCK_LNAME, 
      LU_STOCK.COMPANY_IND, 
      LU_TRADING_DAY.DAY_ID, 
      LU_TRADING_DAY.DATESTART, 
      LU_STATUS.STATUS_ID
    In short, the first query has this condition:
    ( LU_TRADING_MTH.MONTH_ID = 200401)
    The second, this condition was changed to:
    ( LU_TRADING_MTH.MONTH_ID > 200401-1)
    AND
    ( LU_TRADING_MTH.MONTH_ID < 200401+1)

    I've also attached screenshots of its invidual explain plans. Whats going on?
    Attached Thumbnails Attached Thumbnails Slow1.gif   Slow2.gif   Fast1.gif   Fast2.gif  

  2. #2
    Join Date
    Aug 2001
    Location
    UK
    Posts
    4,650
    I assume you have an index on LU_TRADING_MTH.MONTH_ID. The first condition is an index scan and the second one is a tablescan ... because you select everything except 200401.

    If you do no have index on the column LU_TRADING_MTH.MONTH_ID, then either way it is a tablescan - but in the second occasion you are fetching the records to your front end which may indicate the difference. (depends on what front tool you use).

    Also, as you are joining with many other tables, as the number of rows in one of the tables increase, joining with another table obviously will increase the time ..

    HTH

    Sathyaram
    Visit the new-look IDUG Website , register to gain access to the excellent content.

  3. #3
    Join Date
    Feb 2004
    Posts
    21
    Thanks for the prompt reply.

    Yup, I do have an index on LU_TRADING_MTH.MONTH_ID, and LU_TRADING_DAY.MONTH_ID is a foreign key, referencing LU_TRADING_MTH. Ultimate both queries returns rows for the month 200401. Sorry for my ignorance, but if I already have an index, why would it still be a tablescan (2nd statement)?

    I totally understand that with the number of tables/rows involved, this query may be slow. Unfortunately I can't change the query much. I can only play around with the month condition. You mentioned 'front tool'.. How does a tool play a factor in this? But anyway, this query was generated from a reporting tool, and was also copied out and executed in a query tool. Both connects via ODBC, and both takes almost the same duration.

    What can I do?

  4. #4
    Join Date
    Aug 2001
    Location
    UK
    Posts
    4,650
    Oh, I did misread your conditions ..Now it makes sense why the question ...
    Well, AFAIK, the optimizer is not(yet) intelligent enough to understand, even if it is an int column, then
    col > x-1 and col<x+1 is the same as col=x. These are stage-2 predicates.
    About the front end tools, when executing a query from them, from your perception the execution time is the total time taken from the moment you fire the query to when you see the result set.
    If your client tool does some formatting, (and therefore is based on your client machine spec) - then you may feel a bigger elapsed time.

    About - 'why' a tablespace when an index - with a better understanding of your original question, I'm not saying, it will do a tablescan, but it may , based on staistics on this table and others you join with.
    And the 10 minute elapsed means, the optimizer is joining tables before filtering this table. Have you done runstats with distribution for this column?

    Sathyaram
    Visit the new-look IDUG Website , register to gain access to the excellent content.

  5. #5
    Join Date
    Feb 2004
    Posts
    21
    I will do runstats as you suggested. Hopefully that will help.

    But goodness me, I'm still quite bewildered as to how changing one condition makes so much difference. As you said, the slower query applies the conditions only after the joins. I still can't quite grasp why this is so, when all else remained the same. And quite frankly, I would have thought that this query would be faster, due to the fact that the condition was direct (=), rather than (< or >).

  6. #6
    Join Date
    May 2003
    Location
    USA
    Posts
    5,737
    I would try a several things:

    1. Convert the inner join syntax to just regular join predicates (AND ....).

    2. Get rid of as many parenthesis as possible in the WHERE clause. I don't think you need any predicate until you get to IN.

    3. Try not to use RTRIM in the WHERE clause.
    M. A. Feldman
    IBM Certified DBA on DB2 for Linux, UNIX, and Windows
    IBM Certified DBA on DB2 for z/OS and OS/390

  7. #7
    Join Date
    Feb 2004
    Posts
    21
    Marcus_A,

    Thanks for your suggestions. Unfortunately the query is automatically generated, so I can't change much. Would you happen to have a clue why the 2 queries produce such a big difference in performance, with only the Month condition change?

    Anyone?

  8. #8
    Join Date
    Feb 2006
    Location
    Utrecht, Netherlands
    Posts
    16
    You can't change much... Is is possible to force usage of BETWEEN, instead of and AND condition? That should make a difference.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •