Results 1 to 13 of 13
  1. #1
    Join Date
    Feb 2012
    Posts
    23

    Compiler using different runstats at prepared and direct statement

    Today I had a performance issue regarding one statement. The statement was executed through a Java program that is using JDBC by means of a prepared statement took about 70 minutes.
    When capturing the query and executing it manually in an CLP took only 40ms.
    After executing a runstats on the tables involved in the the query, the query executed via a prepared statement was fast again, taking only milliseconds.

    The question I now have is: Does anybody know how the runstats fixed the execution time of the query that was executed by the Java program by means of a prepared statement and why a direct execution of the same query (on the same statistics) did not have such a dramatic performance?

    Isn't DB2 using the same statistics when compiling a query, regardless if it is executed directly or by means of a prepared statement?

    The DB2 database where this occured was restored on a virtual VMware test server, running Windows Server 2008. DB2 LUW AESE version 9.5 fixpack 5.

  2. #2
    Join Date
    Jan 2003
    Posts
    4,130
    runstats fixed the statistics. That is what fixed the execution. When you prepare a statement in JDBC, it looks something like "select * from table1 where col1 = ?". DB2 has to make assumptions on what the value will be. It uses the statistics to do this and generates a plan. If the distribution is skewed, then it might make the wrong access plan.

    Andy

  3. #3
    Join Date
    Jan 2009
    Location
    Zoetermeer, Holland
    Posts
    746
    This smells like "optimization level" (http://www.dbforums.com/db2/1614820-...statement.html).
    Can be set as default, during bind (!! check that!!) in a JDBC connect string and so on.
    Somewhere between " too small" and " too large" lies the size that is just right.
    - Scott Hayes

  4. #4
    Join Date
    Nov 2011
    Posts
    315
    Do you mind to publish your java code that prepare and executes the query?
    Did you use bind parameters?

  5. #5
    Join Date
    Feb 2012
    Posts
    23
    @Andy I know how runstats work, but still the strange thing is that when the same query is executed it executes fast when the query is issued directly, and executes dramatically poor when executed via a prepare phase and the execute it. It's the same query executing on the same runstats.

    @Dick I think we only use the overall setting of the optimization level, meaning I don't think the Java code is setting the optimazation level explicitly. I will check the code tomorrow to be absolutely sure.

    @fengsun2 I will check the code tomorrow and don't think it is so specific to my company that I can't publish it. If it's ok to publish it I will do so tomorrow when I'm back in the office. And to answer directly; yes we do use bind variables / parameter markers. So the query is put in the dynamic package cache once compiled. I also flushed the package cache to exclude any influence from the query being in the cache when comparing both execution ways. Behaviour is still the same; excuted directly from CLP gives good performance, via prepared statement dramatic.

    Thanks for all your comments!
    I'll have a look into the code tomorrow and when it not compromising to my company will publish it tomorrow. Maybe one of you then knows the answer to this strange behaviour.

  6. #6
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,427
    See if this better explains what you're experiencing:

    Using the REOPT bind option with input variables in complex queries
    ---
    "It does not work" is not a valid problem statement.

  7. #7
    Join Date
    Feb 2012
    Posts
    23
    @Dick Optimisation level is not set in the Java code, so that can't be it. The DFT_QUERYOPT is set to the default 5.

    @fengsun2 See below a stripped part of our code. It's just straightforward..

    @n_i Thanks for your comment, I've looked into it but don't think it's such a complex query. See below a stripped version of the query.

    Java code snippet
    Code:
    protected List executeCommand() {
      
        final String sqlCmd = getSqlCommand();
        final StringBuilder prmStr = new StringBuilder(1024);
    
        try {
    
          con = ConnectionProvider.getConnection();
          ps = con.prepareStatement(sqlCmd);
    
          // Set the parameters
          for (int i = 0; i < sqlParams.size(); i++) {
            Object paramObj = sqlParams.get(i);
            ps.setObject(i + 1, paramObj);        
            prmStr.append(paramObj);
            if (i < sqlParams.size() - 1) {
              prmStr.append(", ");
            }
          }
    
          final List<Object> result;
    
          rs = ps.executeQuery();
          final ResultSetMetaData meta = rs.getMetaData();
          final int colCount = meta.getColumnCount();
    
          // Fetch the rows      
          .....            
          return result;
          
        } catch (SQLException e) {
          .....
        } finally {
          ConnectionProvider.releaseResources(rs, ps, con);
          rs = null;
          ps = null;
          con = null;
        }
     }
    The stripped query
    SELECT DISTINCT id, email
    FROM Table
    JOIN otherTable ON Table.id = otherTable.id
    JOIN ...
    JOIN ...
    JOIN ...
    JOIN ...
    JOIN ...
    JOIN ...
    JOIN ...
    JOIN ...
    JOIN ...
    JOIN ...
    WHERE anID = ? AND booleanValue = 'F'
    AND ...
    AND ...
    AND ...
    AND ...
    AND ...
    AND ( (id > ?) OR (id =?) )
    ORDER BY id ASC
    FETCH FIRST 51 ROWS ONLY FOR READ ONLY WITH UR

  8. #8
    Join Date
    Jan 2003
    Posts
    4,130
    When you issue the query "directly" do you use parameter markers? I suspect not. Then it is NOT the same query because you used literals. DB2 can make a different access plan for this. Your query is doing ID > ? or ID = ?. Without actual values, DB2 will make assumptions on the values passed. The Greater-Than and the OR are almost certainly the culprits for having a bad access plan.

    Andy

  9. #9
    Join Date
    Feb 2012
    Posts
    23
    You could have a point there Andy! I did use literals when issuing the query directly. I'll go and experiment with the differences between literals and parameter markers regarding this specific query when I am back at the office again. I am enjoying a nice Christmas holiday starting from today and will be back at the office at the 7th of January.
    I am a little surprised that the differences in executing time can be so extremely indifferent, but really think you hit the right spot. Thanks for your reply! I'll come back on it after my holliday. I wish you a Merry Christmas and a happy New Year!

  10. #10
    Join Date
    Jan 2003
    Posts
    4,130
    You can use the "Command Editor" to show you the access plans for both the query with parameter markers and with literals.

    Andy

  11. #11
    Join Date
    Nov 2011
    Posts
    315
    Before you runstats , the query executed through a Java program is reusing the access plan hold in the package cache ( even you prepare the query explicity )。
    when you capture the query and executed it mannully in an CLP , it is a new query to db2 because you used literals string instead of the parameter markers, so db2 compile it again and generated a new access plan.
    when you issued runstats on the tables , it will soft invalidate the access plan , so when you exected the java program again , db2 will recompile it and generated a new plan .....
    So that is the reason for "runstats fixed the execution time of the query that was executed by the Java program".

  12. #12
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,361
    Quote Originally Posted by fengsun2 View Post
    when you issued runstats on the tables , it will soft invalidate the access plan , so when you exected the java program again , db2 will recompile it and generated a new plan .....
    So that is the reason for "runstats fixed the execution time of the query that was executed by the Java program".

    Baddus mentioned:
    "So the query is put in the dynamic package cache once compiled. I also flushed the package cache to exclude any influence from the query being in the cache when comparing both execution ways. Behaviour is still the same; excuted directly from CLP gives good performance, via prepared statement dramatic."

    So, as per above, the query was recompiled by db2 but using the old statistics. I think optimizer doesn't use distribution statistics when the query contains parameter markers so the access plan can be different. But such a big improvement after collecting stats - from 70min to 40ms??

  13. #13
    Join Date
    Nov 2011
    Posts
    315
    yes, may be you are right,
    but , we noticed that " The statement was executed through a Java program that is using JDBC by means of a prepared statement took about 70 minutes.When capturing the query and executing it manually in an CLP took only 40ms."
    that means ,before updating the statistics , just recompiling the query will get the good plan.
    so i doubt that "flush the package cache" statement sometimes will not invalidate the plan when it is being in use ( I am not sure about that ).

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •