Page 1 of 2 12 LastLast
Results 1 to 15 of 16
  1. #1
    Join Date
    Dec 2007
    Location
    Richmond, VA
    Posts
    1,328
    Provided Answers: 5

    Answered: poor query performance due to CODE PAGE???

    Hi all,
    DB2 v9.7.5
    I have the same database in 2 different instances. We'll call the one Inst1 and the other Inst2. On Inst1 I have this query that runs sub-second. On Inst2 it runs for 38 seconds. Same data in both instances, same structures(tables,indexes,etc...). The tables have been reorged and stats collected on both instances, same clustering index. I started looking at the DBM CFG and DB CFG. Everything is the same between the two, except for the code page. There were a couple of small differences, that I didn't think would affect performance, but I changed them on Inst2 to match Inst1. Inst1 is 819(latin1) and Inst2 is 1208(UTF-8).
    Has anyone else ever seen similar? I wouldn't expect the code page to cause this type of difference.
    Dave

  2. Best Answer
    Posted by dav1mo

    "IT IS CONFIRMED!!! Our huge response time issue was due to the NO FILE SYSTEM CACHING. subsecond with FILE SYSTEM CACHING and 38 seconds with NO FILE SYSTEM CACHING."


  3. #2
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    I'd probably start by looking at the two explain plans.
    ---
    "It does not work" is not a valid problem statement.

  4. #3
    Join Date
    Dec 2007
    Location
    Richmond, VA
    Posts
    1,328
    Provided Answers: 5
    Sorry bout that. The explains are the same as well. As I stated previously the only differences is the CCSIDs, one at 819 and the other at 1208.
    Dave

  5. #4
    Join Date
    Dec 2007
    Location
    Richmond, VA
    Posts
    1,328
    Provided Answers: 5
    Here are the actual differences between the two instances.

    Code:
    CFG FOR UDBQD053                                   CFG FOR UDBQD056
    Database code page                    = 819		    Database code page                    = 1208
    Database code set                       = ISO8859-1	    Database code set                       = UTF-8
    Database collating sequence         = UNIQUE	    Database collating sequence        = SYSTEM_819
    Multi-page file allocation enabled  = NO		    Multi-page file allocation enabled  = YES
    Dave

  6. #5
    Join Date
    Apr 2012
    Posts
    1,035
    Provided Answers: 18
    Are you measuring performance from CLP or from the application-side or from db2 monitoring?


    If the query returns a large result-set, is there a network difference from application-side to both instances?


    Is there a concurrency difference between the instances (e.g. is one more likely to wait on locks, or wait on i/o etc)?

    Is the hardware identical (#cores, cpu speed, ram available) between the two instances?

    You say explain plans are the same between the instances, does that include also the cardinality estimates for each operator? (db2caem will help if they differ, to show which operators have different expected/actual cardinalities).

    You can use the monitoring functions/views to compare the query wait times, or sort times, between the instances: that should explain the different timings.

  7. #6
    Join Date
    Dec 2007
    Location
    Richmond, VA
    Posts
    1,328
    Provided Answers: 5
    Both databases were on the same virtual server of a P7, so no difference in the server/network anything else, we actually have 5 instances on this one server. All had different data, but due to this issue we imported data from the one that was not experiencing the slowness to the one that was, so that we could be sure that compare an apple to an apple. Yes, the explains were identical. Using the same san drives as the server has 5 volumes defined and all 5 instances on this server use them all. We are now in the process of changing the one problematic instance to the same CCSID as the others. Will be a bit of a pain, due to having to export all data/then import it back in once the database is redefined. As stated previously there are only FOUR(4) differences between the two instances.
    Dave

  8. #7
    Join Date
    Apr 2012
    Posts
    1,035
    Provided Answers: 18
    Is this P7 Linux or P7 AIX ?

    You are suggesting that the UTF-8(with unique collation) is the cause of the 38sec query-time (vs subsecond query time with ISO-8859-1/system-819 collation).

    I would double-check.


    First, I would check if a UTF-8 database with collation "system_819_xx" (where XX is your territory as long as territory is not DK, FI, IS, NO, or SE) exhibits the symptom.

    Next I would restore a full backup of the UTF-8 database to a different Lpar, and check query-time, and finally restore a full backup of the ISO-8859 database to the lpar that currently hosts the UTF-8 database and check query-time.

    I expect some performance difference due to different collating sequences, but not a 38x difference.

  9. #8
    Join Date
    Dec 2007
    Location
    Richmond, VA
    Posts
    1,328
    Provided Answers: 5
    Is this P7 Linux or P7 AIX ?
    AIX

    I would double-check.
    We have.
    Next I would restore a full backup of the UTF-8 database to a different Lpar, and check query-time, and finally restore a full backup of the ISO-8859 database to the lpar that currently hosts the UTF-8 database and check query-time.
    We have many times already. The problem was first reported to us from the users of the production environment. We did a redirected restore to our play server to see if we could reproduce the same performance differences. We did, very clearly. When we exported the data from the UTF-8 source where it was taking 38 seconds and imported it to the ISO8859-1(same query ran subsecond) we received very different response times. We then took the data from the ISO8859-1 source and moved that over to the UTF-8 instance and had the same much longer run times as with the original data. There is only 150 rows difference between the data on the two instances.
    Dave

  10. #9
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    Care to post the two execution plans generated with db2exfmt?
    ---
    "It does not work" is not a valid problem statement.

  11. #10
    Join Date
    Apr 2012
    Posts
    1,035
    Provided Answers: 18
    Did you eliminate a UTF-8 database with SYSTEM_819_XX collation ?

  12. #11
    Join Date
    Dec 2007
    Location
    Richmond, VA
    Posts
    1,328
    Provided Answers: 5
    we ill have an etr with IBM. Yet we think we found the issue this morning. Looks like it might not be our code page after all. Is starting to look like it is our tablespace definition. We are still testing a couple of items here to prove/disprove our theory, but so far, looks like our culprit is the parm NO FILE SYSTEM CACHING on the tablespace in the offending system. We have the same tablespace in other systems where we get much faster response times as FILE SYSTEM CACHING. Will post later after confirming the actual reason.
    Dave

  13. #12
    Join Date
    Dec 2007
    Location
    Richmond, VA
    Posts
    1,328
    Provided Answers: 5
    IT IS CONFIRMED!!! Our huge response time issue was due to the NO FILE SYSTEM CACHING. subsecond with FILE SYSTEM CACHING and 38 seconds with NO FILE SYSTEM CACHING.
    Dave

  14. #13
    Join Date
    May 2012
    Location
    Canberra, Australia
    Posts
    176
    Provided Answers: 6
    Dave, this is contrary to everything IBM is telling us. I would be interested in their explanation.

  15. #14
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    Quote Originally Posted by tafster View Post
    I would be interested in their explanation.
    It could easily be the case if they were reading non-inlined LOBs in their queries. This is why I hate performance-related questions without actual query texts or explain plans -- you keep guessing for a week or two, while you could have found the answer after the first glance.
    ---
    "It does not work" is not a valid problem statement.

  16. #15
    Join Date
    Apr 2012
    Posts
    1,035
    Provided Answers: 18
    Agree with n_i. The statements "Everything is the same between the two, except for the code page. There were a couple of small differences.." are only valid if the original poster gave sufficient facts about what exactly was compared (other than the db cfg, dbm cfg, registry etc). A suitable db2look (that included the tablespace details) comparison between the working and failing environments would also have shown this difference in tablespace configuration. But that is the luxury of hindsight.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •