Results 1 to 5 of 5
  1. #1
    Join Date
    Jun 2009
    Posts
    2

    Unanswered: Might be Basic ?, but need help. Thanks

    New to DB2 world !

    Database : db2 in mainframe. all tables a,b,c,d have 50 Million records each .

    Query1 : select a.col1, a.col2, b.col1, b.col2, c.col1, c.col2, d.col1,d.col2
    from a, b, c, d
    where
    a.key = b.f_key
    and a.key = c.f_key
    and a.key = d.f_key
    and a. key between (1 and 1 million)

    Query2 : select a.col1, a.col2, b.col1, b.col2, c.col1, c.col2, d.col1,d.col2
    from a, b, c, d
    where
    a. key between (1 and 1 million)
    and a.key = b.f_key
    and a.key = c.f_key
    and a.key = d.f_key

    Query3 : select a.col1, a.col2, b.col1, b.col2, c.col1, c.col2, d.col1,d.col2
    from a, b, c, d
    where
    a. key between (1 and 1 million)
    and b.f_key between (1 and 1 million)
    and c.f_key between (1 and 1 million)
    and d.f_key between (1 and 1 million)
    and a.key = b.f_key
    and a.key = c.f_key
    and a.key = d.f_key

    We have to process one million records at each time . Whats the better way to query

    Will there be any difference in the query cost/plan for the above mentioned queries. Which is the better way to write the query considering indexes are available in the key columns.

    Help is really appreciated. Thanks in Advance

  2. #2
    Join Date
    Sep 2004
    Location
    Belgium
    Posts
    1,126
    No, there will be no difference. The optimizer will automatically add the 3 extra conditions (as of your Query 3); this is called transitive closure. The order in which you write the conditions, at least in this example, don't matter since the four "between" conditions are all applied on the individual tables (before the join) while the other three are join conditions (used during the join).
    The following formulation, again equivalent for the optimizer, expresses more closely what the sequence of execution will be:
    Code:
    SELECT a.col1, a.col2, b.col1, b.col2, c.col1, c.col2, d.col1, d.col2
    FROM   (SELECT a.col1, a.col2, a.key
            FROM   a WHERE a.key between 1 and 1000000) a
           INNER JOIN
           (SELECT b.col1, b.col2, b.f_key
            FROM   b WHERE b.f_key between 1 and 1000000) b
           ON a.key = b.f_key
           INNER JOIN
           (SELECT c.col1, c.col2, c.f_key
            FROM   c WHERE c.f_key between 1 and 1000000) c
           ON a.key = c.f_key
           INNER JOIN
           (SELECT d.col1, d.col2, d.f_key
            FROM   d WHERE d.f_key between 1 and 1000000) d
           ON a.key = d.f_key
    The three joins themselves may be implemented in different order (4x3x2=24 different choices) and different flavours (nested loop, merge scan, or hybrid: each time 3 choices), i.e., a total of 24x27=652 possibilities.
    Last edited by Peter.Vanroose; 06-02-09 at 15:13.
    --_Peter Vanroose,
    __IBM Certified Database Administrator, DB2 9 for z/OS
    __IBM Certified Application Developer
    __ABIS Training and Consulting
    __http://www.abis.be/

  3. #3
    Join Date
    Dec 2008
    Posts
    76
    There is no functional difference between the queries. The optimizer will deal with all the same way.

    (Peter and I answered at the same time)
    Last edited by rdutton; 06-02-09 at 15:09.

  4. #4
    Join Date
    Jun 2009
    Posts
    2

    Thanks a lot

    Thanks a lot for quick response

    Spl thanks to Peter for a detail explanation to understand much better!!

    I really Appreciate it

    Thanks
    sarahman

  5. #5
    Join Date
    May 2003
    Posts
    113
    in DB2 for z/OS(mainframe), there is an optimization method called TAR JOIN. which may be a good fit for your queries.

    By default, this STAR JOIN method is OFF, but can be turned on by a zPARM. Your DBA person can help on this.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •