  1. #1
    Join Date
    Jun 2005
    Posts
    8

    Question Unanswered: Performance question in DB2

    Hi, guys.

    In an effort to improve the performance of my queries, I was thinking of doing the following, but I need your advice on whether that would actually do the trick.

    I have a large table of data. When I access it, I most often need certain chunks of data that contain 20-200 rows each. Very seldom do I need rows from different chunks.

    Would it help if I stored the rows that belong to the same chunk in sequence (that is, in the sequence of their primary keys - IDs)? Or does DB2 handle this on its own, making the effort useless?

    What other suggestions can you think of?

    Thank you.
    - Dan

  2. #2
    Join Date
    Aug 2001
    Location
    UK
    Posts
    4,650
    Post the table DDL (including indexes) and your query. What is the number of rows in your table?
    Visit the new-look IDUG Website, register to gain access to the excellent content.

  3. #3
    Join Date
    Jun 2005
    Posts
    8
    Quote Originally Posted by sathyaram_s
    Post the table DDL (including indexes) and your query. What is the number of rows in your table?
    Thanks for replying.

    The table currently has 1,200,000 rows, and it is expected to keep growing on a weekly basis.
    The table has a primary key - ID, and some foreign keys.

    There are 2 main queries for this table:
    1) In its simple form:
    SELECT * FROM BRS WHERE BRS.ID in (?,?,?.... x n times).

    2) The second query gets information by providing a list of one of the foreign keys:
    SELECT * FROM BRS WHERE BRS.DRIVERCOMBOID in (?,?,? .. x m times) AND <some other small restrictions>

    Usually the "chunks" of data that I refer to, are rows that have same certain foreign keys, and I usually need all of the at once.

    The DDL output is pretty long - what section is needed from there?

    Thanks

  4. #4
    Join Date
    Nov 2005
    Location
    IL
    Posts
    557
    1.2 million rows is not such a big table. How many columns are in the table? What are the keys on this table? Why do you do "SELECT *"? How long does your SQL run now? Is there an ORDER BY or GROUP BY clause in your SQL?
    --
    IBM Certified DBA on DB2 for Linux, UNIX, and Windows

    DB2 v9.7.0.6 os 6.1.0.0

  5. #5
    Join Date
    Jun 2005
    Posts
    8
    Quote Originally Posted by Cougar8000
    1.2 million rows is not such a big table. How many columns are in the table? What are the keys on this table? Why do you do "SELECT *"? How long does your SQL run now? Is there an ORDER BY or GROUP BY clause in your SQL?
    Thanks for replying.

    I'll try to answer some of the questions:
    1) The table has 16 columns. I think all, or at least most of the columns, are selected.
    ID - primary key
    6 foreign keys (most important ones are indexed)
    2) Hard to estimate the time that it takes. I'd say about 80-90 seconds for 200 "chunks" of data (each chunk is about 10-20 rows on average), so roughly 0.4-0.45 sec per chunk (a lot of approximation - these numbers might be off).
    3) No order or group.

    Thanks.

  6. #6
    Join Date
    Sep 2004
    Location
    Belgium
    Posts
    1,126
    Output from EXPLAIN could be useful to find answers on index use and ordering.
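    For example, on LUW, something along these lines (the database name MYDB is just a placeholder, and the explain tables must already exist - they can be created from sqllib/misc/EXPLAIN.DDL):

    EXPLAIN PLAN FOR SELECT * FROM BRS WHERE BRS.ID IN (?, ?, ?);
    -- then, from the command line, format the most recent plan:
    db2exfmt -d MYDB -1 -o brs_plan.txt

    The CLUSTERRATIO/CLUSTERFACTOR values for the index in SYSCAT.INDEXES tell you how well the rows are actually sequenced with respect to that index.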

    To return to your initial question: storing rows in sequence might help, but only if DB2 (the optimizer) is aware of it. So you probably just want to run REORG and then RUNSTATS on your tablespace.
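    Something like this, for instance (the schema and index names below are only placeholders for your own):

    -- re-cluster the rows in the sequence of the chosen index:
    REORG TABLE MYSCHEMA.BRS INDEX MYSCHEMA.BRS_DRIVERCOMBO_IX;
    -- refresh the statistics so the optimizer knows about the new clustering:
    RUNSTATS ON TABLE MYSCHEMA.BRS WITH DISTRIBUTION AND DETAILED INDEXES ALL;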

    The benefit over the current situation would then come from a tablespace scan instead of an index access; I would not expect more than a factor-2 time gain (at the very most!) in this case.
    -- Peter Vanroose,
    IBM Certified Database Administrator, DB2 9 for z/OS
    IBM Certified Application Developer
    ABIS Training and Consulting
    http://www.abis.be/

  7. #7
    Join Date
    Jun 2005
    Posts
    8
    Peter, thanks for your reply.

    Quote Originally Posted by Peter.Vanroose
    Output from EXPLAIN could be useful to find answers on index use and ordering.
    I've tried it - it gave me the same "access plan" with sequential IDs as with non-sequential ones. How would I know whether it fetches in a sequential manner?

    Quote Originally Posted by Peter.Vanroose
    To return to your initial question: storing rows in sequence might help, but only if DB2 (the optimizer) is aware of it. So you probably just want to run REORG and then RUNSTATS on your tablespace.

    The benefit over the current situation would then come from a tablespace scan instead of an index access; I would not expect more than a factor-2 time gain (at the very most!) in this case.
    Do you mean that just performing a REORG internally reorganizes the table? How do I specify how to reorganize it? Does it use statistics?

    Thanks.

  8. #8
    Join Date
    Apr 2004
    Posts
    54
    I do not think a query like "SELECT * FROM BRS WHERE BRS.ID in (?,?,?.... x n times)" is a good one.
    I'd suggest creating a declared global temporary table:
    DECLARE GLOBAL TEMPORARY TABLE SESSION.BRS (ID INT)
    ON COMMIT PRESERVE ROWS;
    CREATE UNIQUE INDEX SESSION.IBRS ON SESSION.BRS (ID) ALLOW REVERSE SCANS;
    and join it with BRS.
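    Roughly like this (just a sketch - qualify BRS with its real schema if needed):

    -- after inserting the IDs you need into SESSION.BRS:
    SELECT B.*
    FROM BRS B
    JOIN SESSION.BRS T
      ON T.ID = B.ID;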

  9. #9
    Join Date
    Jun 2005
    Posts
    8
    Quote Originally Posted by gardenman
    I do not think a query like "SELECT * FROM BRS WHERE BRS.ID in (?,?,?.... x n times)" is a good one.
    I'd suggest creating a declared global temporary table:
    DECLARE GLOBAL TEMPORARY TABLE SESSION.BRS (ID INT)
    ON COMMIT PRESERVE ROWS;
    CREATE UNIQUE INDEX SESSION.IBRS ON SESSION.BRS (ID) ALLOW REVERSE SCANS;
    and join it with BRS.
    Thanks, gardenman.

    I was actually using this query before, but it turned out to be somewhat slower. I didn't construct an index, though, figuring the overhead would be too great. I will try it again.

    How would you suggest populating the temp table? The way I was doing this is by parsing a CSV string of IDs passed as an argument to a stored procedure. Is there a better way?
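    For reference, what I do now inside the procedure boils down to this (simplified):

    -- executed once for every ID parsed out of the CSV string:
    INSERT INTO SESSION.BRS (ID) VALUES (?);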

    Regards,
    Dima

  10. #10
    Join Date
    Mar 2004
    Posts
    46
    Dan,

    One option that you can explore is the use of MDC (multidimensional clustering). You would use the different foreign keys, and the primary key if applicable, as the different dimensions. Note, however, that MDC creates a "block" for each combination of dimension values, so there is a likelihood of significant wasted/unused space. Using MDC allows DB2 to home in on the specific blocks that satisfy your query predicates.
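    As a rough sketch (column names and types here are only placeholders; pick the real dimension column(s) from your foreign keys):

    CREATE TABLE BRS_MDC (
        ID             INTEGER      NOT NULL PRIMARY KEY,
        DRIVERCOMBOID  INTEGER      NOT NULL,
        PAYLOAD        VARCHAR(100)            -- stands in for the remaining columns
    ) ORGANIZE BY DIMENSIONS (DRIVERCOMBOID);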

    Another option is to cluster the data on one of the columns/keys (pick the one with the most distinct values) and have the other columns as include columns. That way, DB2 can eliminate rows through index retrieval alone.
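    For example (only a sketch - DB2 requires a UNIQUE index for INCLUDE columns, so the primary key ID is appended to keep it unique, and COL_A/COL_B stand in for whatever columns your query actually fetches):

    CREATE UNIQUE INDEX BRS_DRVCOMBO_IX
        ON BRS (DRIVERCOMBOID, ID)
        INCLUDE (COL_A, COL_B)
        CLUSTER;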

    Depending on the application type, concurrency, and other factors, you can experiment with indexes on your keys, with setting the default degree of parallelism (DFT_DEGREE) to ANY, and with raising the query optimization level (experiment with 5 and higher). The objective is to see whether DB2 finds any benefit in index ANDing/ORing and parallelism.
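    For instance (MYDB is just a placeholder; intrapartition parallelism also has to be enabled at the instance level):

    UPDATE DBM CFG USING INTRA_PARALLEL YES;       -- enable intrapartition parallelism for the instance
    UPDATE DB CFG FOR MYDB USING DFT_DEGREE ANY;   -- default degree of parallelism for the database
    SET CURRENT DEGREE = 'ANY';                    -- or set it per session
    SET CURRENT QUERY OPTIMIZATION = 7;            -- default is 5; experiment with 5, 7, 9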

    Good Luck.

    Jayesh

  11. #11
    Join Date
    Sep 2004
    Location
    Belgium
    Posts
    1,126
    Quote Originally Posted by dimka
    Do you mean that just performing a REORG internally reorganizes the table? How do I specify how to reorganize it? Does it use statistics?
    Have a look at the manuals: http://publib.boulder.ibm.com/infoce...e/r0001966.htm
    (for DB2 V8 on Linux/Unix/Windows) or http://publibz.boulder.ibm.com/cgi-b...20030507154051 (for DB2 V7 on z/OS).
    -- Peter Vanroose,
    IBM Certified Database Administrator, DB2 9 for z/OS
    IBM Certified Application Developer
    ABIS Training and Consulting
    http://www.abis.be/
