If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > DB2 > Performance Issues with Java Prepared Statments and Timestamps

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 07-21-04, 21:22
Beegee Beegee is offline
Registered User
 
Join Date: Jul 2004
Posts: 5
Performance Issues with Java Prepared Statments and Timestamps

Hello everybody,

in our company's production environment, we are suffering from severe performance problems with our DB2 database. Our Setup is the following:

Fujitsu Siemens Server with 4 CPUs (2GHz Xeon), 8 GB RAM
DB2 Workgroup Edition 8.1.5
Windows 2000 Advanced Server.

While tuning our web application, which is written in Java, I noticed a strange issue with Prepared Statements: When we are using a Prepared Statement which has a Timestamp as a host variable, it takes significantly longer to execute the statement than without the host variable.

For illlustration, I wrote the example program below. The Statement without the host variable takes about 100 ms in a test environment to execute 10 times. (I filled the table with 15000 rows with test data). The other statement takes more than one second to execute 10 times.

Why is this so and what can be done???

Any help is appreciated, I'm despreate!!!

Greetings, Beegee.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.Timestamp;
import java.util.Calendar;
import java.util.Properties;

public class Main {

public static void main(String[] args) {

try {
System.out.println("Connecting...");
Class.forName("com.ibm.db2.jcc.DB2Driver");
Connection con = DriverManager.getConnection("jdbc:db2://localhost:50000/test_tcp","db2admin","db2admin");

System.out.println("Done.");

// PreparedStatement ps = con.prepareStatement("select id from table1 where created_at < '2004-07-22-02.20.00.000000'");
PreparedStatement ps = con.prepareStatement("select * from table1 where created_at < ?");

Calendar c = Calendar.getInstance();
c.set(2004, 07, 22, 2, 20, 0);
long ts = c.getTimeInMillis();

long start = Calendar.getInstance().getTimeInMillis();

ps.setTimestamp(1, new Timestamp(ts));

for( int i = 0; i < 100; i++)
ps.execute();

long end = Calendar.getInstance().getTimeInMillis();

System.out.println("Execute lasted " + (end-start) + " ms.");
}
catch( Exception e ) {
e.printStackTrace();
}
}
}
Reply With Quote
  #2 (permalink)  
Old 07-21-04, 22:42
Marcus_A Marcus_A is offline
Registered User
 
Join Date: May 2003
Location: USA
Posts: 5,196
I know very little about Java, but I would try one of the following:

1. If all 10 inserts need to have the same timestamp, then issue the following SQL statement to retrieve it:

SELECT current timestamp from SYSCAT.DUMMY1

2. If each insert can have its own timestamp, then use "current timestamp" instead of the host variable, or define the column as NOT NULL WITH DEFAULT and do not insert that column (DB2 will put the current timestamp in the column when a row is inserted).
__________________
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390
Reply With Quote
  #3 (permalink)  
Old 07-22-04, 05:17
Beegee Beegee is offline
Registered User
 
Join Date: Jul 2004
Posts: 5
Hi Marcus,

thanks for your answer. If I get you right, you suggest not to use timestamps as host variables.

However, this is no valid solution for our data model. We use timestamps in a number of tables and users connect to the database through a web application.

Therefore, we must be flexible with parameters in statements issued by the web application, since we can't hardcode any possible combination of parameters into the application.

I don't want to emphasize the Java aspect to much, so, generally speaking, to avoid the prepare/execute model isn't a valid option either, because the application's persitence framework works with prepared statements exclusively.

Although I didn't try with ODBC or a stored procedure yet (anything that can do prepare/execute), I don't think this is a real Java issue, but more a matter of the way DB2 manages prepared statements with host variables (which we just can't avoid).

Again, thanks for your answer.

Rüdiger.
Reply With Quote
  #4 (permalink)  
Old 07-22-04, 09:14
n_i n_i is offline
:-)
 
Join Date: Jun 2003
Location: Toronto, Canada
Posts: 4,449
Looks like optimizer tricks to me. When you supply a literal timestamp the optimizer seems to be selecting a more efficient plan than in the case where the timestamp value is not known at the preparation time.

Try comparing the execution plans for both statements.
Reply With Quote
  #5 (permalink)  
Old 07-22-04, 09:51
Marcus_A Marcus_A is offline
Registered User
 
Join Date: May 2003
Location: USA
Posts: 5,196
No, I wasn't suggesting that you not use host variables for a timestamp on a query. I thought the problem was in retrieving the system date (I told you I didn't know much about Java). If your problem SQL statement is:

select * from table1 where created_at < ?

then in some cases DB2 will make certain assumptions about what percent of the table needs to be read to satisfy this query. If the value is actually hard-coded (instead of host variable), then DB2 has a better idea of the number of rows that might be returned.

I assume you have an index on "created_at"? If so, then DB2 might not use the index with a less than (<) predicate unless you give it some hints that the number of rows returned will be relatively small. If the number of rows returned is relatively large (for example, half the table), and the table is not clustered on "create-at", then DB2 may not use the index on this column becasue it faster to scan the whole table without the index.

First, make sure your table is reorged on a reasonable interval. If your table has a clustering index, then you should reorg on that index. You may have the wrong clustering index defined, but that is another matter.

Next, make sure you do comprehensive runstats on the table and key columns with distribution of key columns. This is very important.

For your SQL statement, you can try adding the optimize clause:

select * from table1 where created_at < ? optimize for 1 row

This will tell the optimizer that, even though you are using a less than predicate (<), the number or rows returned is small and that DB2 should use an index on that column (if there is one).

You can test the access paths with the visual explain to see how it will work, with and without the optimize clause (but make sure you do the reorg and runstats with distribution).

Let us know how it works out.
__________________
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390
Reply With Quote
  #6 (permalink)  
Old 07-22-04, 18:28
Beegee Beegee is offline
Registered User
 
Join Date: Jul 2004
Posts: 5
Smile Distribution on key columns rules!

Thanks for your most valuable insight . The distribution on key columns did the trick.

Although our production crew does runstats where necessary on a regularly basis, there was quite a cluster ratio on on of the concerned tables.
When I did the runstats in my test environment (after copying the data from the production database via the export/load utitlity), the performance gain was 80% and the gap between preparing the statement with and without host variables has completely vanished!

Greetings, Beegee
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On