Results 1 to 9 of 9
  1. #1
    Join Date
    Mar 2011
    Posts
    20

    Unanswered: DB2 Performance V9.5

    Hi All

    I am a novice DB2 DBA . Our application DB (DB2 V9.5) is encountering performance issue. We noticed a few sort overflow ( above 3%) for a few queries. We have 24 GB Ram , 8 cpus. 4 test databases with STMM enabled.

    The data is huge in environment. Tables contain millions of records.
    No indexes were recommended by db2 advisor. Even a single additional filter on a non-index columns results in poor performance.
    Few queries COUNT(1) returns arithmetic overflow SQL0802N . On trying to use COUNT_BIG(1) no result is returned. It runs for ages.

    Could anyone please help if any configuration updates are needed.
    Please let me know if more details are needed for more clarity

    Below are the DBM and DB configurations

    db2inst1@c01z0090:~> db2 get dbm config | grep MEM

    Size of instance shared memory (4KB) (INSTANCE_MEMORY) = AUTOMATIC(5385068)
    db2inst1@c01z0090:~> db2 get dbm config | grep SHEAP
    Sort heap threshold (4KB) (SHEAPTHRES) = 0
    db2inst1@c01z0090:~> db2 get dbm config | grep INTRA
    Enable intra-partition parallelism (INTRA_PARALLEL) = YES



    Database Configuration for Database aruatdb

    Database configuration release level = 0x0c00
    Database release level = 0x0c00

    Database territory = US
    Database code page = 1208
    Database code set = UTF-8
    Database country/region code = 1
    Database collating sequence = IDENTITY
    Alternate collating sequence (ALT_COLLATE) =
    Number compatibility = OFF
    Varchar2 compatibility = OFF
    Database page size = 4096

    Dynamic SQL Query management (DYN_QUERY_MGMT) = DISABLE

    Discovery support for this database (DISCOVER_DB) = ENABLE

    Restrict access = NO
    Default query optimization class (DFT_QUERYOPT) = 3
    Degree of parallelism (DFT_DEGREE) = ANY
    Continue upon arithmetic exceptions (DFT_SQLMATHWARN) = NO
    Default refresh age (DFT_REFRESH_AGE) = 0
    Default maintained table types for opt (DFT_MTTB_TYPES) = SYSTEM
    Number of frequent values retained (NUM_FREQVALUES) = 10
    Number of quantiles retained (NUM_QUANTILES) = 20

    Decimal floating point rounding mode (DECFLT_ROUNDING) = ROUND_HALF_EVEN

    Backup pending = NO

    Database is consistent = NO
    Rollforward pending = NO
    Restore pending = NO

    Multi-page file allocation enabled = YES

    Log retain for recovery status = NO
    User exit for logging status = NO

    Self tuning memory (SELF_TUNING_MEM) = ON
    Size of database shared memory (4KB) (DATABASE_MEMORY) = AUTOMATIC(4009878)
    Database memory threshold (DB_MEM_THRESH) = 100
    Max storage for lock list (4KB) (LOCKLIST) = AUTOMATIC(256096)
    Percent. of lock lists per application (MAXLOCKS) = AUTOMATIC(98)
    Package cache size (4KB) (PCKCACHESZ) = AUTOMATIC(26653)
    Sort heap thres for shared sorts (4KB) (SHEAPTHRES_SHR) = AUTOMATIC(1637642)
    Sort list heap (4KB) (SORTHEAP) = AUTOMATIC(163764)

    Database heap (4KB) (DBHEAP) = AUTOMATIC(2444)
    Catalog cache size (4KB) (CATALOGCACHE_SZ) = 800
    Log buffer size (4KB) (LOGBUFSZ) = 256
    Utilities heap size (4KB) (UTIL_HEAP_SZ) = 199439
    Buffer pool size (pages) (BUFFPAGE) = 8000
    SQL statement heap (4KB) (STMTHEAP) = AUTOMATIC(4096)
    Default application heap (4KB) (APPLHEAPSZ) = AUTOMATIC(256)
    Application Memory Size (4KB) (APPL_MEMORY) = AUTOMATIC(40000)
    Statistics heap size (4KB) (STAT_HEAP_SZ) = AUTOMATIC(4384)

    Interval for checking deadlock (ms) (DLCHKTIME) = 10000
    Lock timeout (sec) (LOCKTIMEOUT) = -1

    Changed pages threshold (CHNGPGS_THRESH) = 80
    Number of asynchronous page cleaners (NUM_IOCLEANERS) = AUTOMATIC(7)
    Number of I/O servers (NUM_IOSERVERS) = AUTOMATIC(6)
    Index sort flag (INDEXSORT) = YES
    Sequential detect flag (SEQDETECT) = YES
    Default prefetch size (pages) (DFT_PREFETCH_SZ) = AUTOMATIC

    Track modified pages (TRACKMOD) = OFF

    Default number of containers = 1
    Default tablespace extentsize (pages) (DFT_EXTENT_SZ) = 32

    Max number of active applications (MAXAPPLS) = AUTOMATIC(182)
    Average number of active applications (AVG_APPLS) = AUTOMATIC(1)
    Max DB files open per application (MAXFILOP) = 61440

    Log file size (4KB) (LOGFILSIZ) = 40960
    Number of primary log files (LOGPRIMARY) = 50
    Number of secondary log files (LOGSECOND) = 200
    Changed path to log files (NEWLOGPATH) =
    Path to log files = /db/dbdata01/db9sr001/NODE0000/SQL00004/SQLOGDIR/
    Overflow log path (OVERFLOWLOGPATH) =
    Mirror log path (MIRRORLOGPATH) =
    First active log file =
    Block log on disk full (BLK_LOG_DSK_FUL) = NO
    Block non logged operations (BLOCKNONLOGGED) = NO
    Percent max primary log space by transaction (MAX_LOG) = 0
    Num. of active log files for 1 active UOW(NUM_LOG_SPAN) = 0

    Group commit count (MINCOMMIT) = 1
    Percent log file reclaimed before soft chckpt (SOFTMAX) = 520
    Log retain for recovery enabled (LOGRETAIN) = OFF
    User exit for logging enabled (USEREXIT) = OFF

    HADR database role = STANDARD
    HADR local host name (HADR_LOCAL_HOST) =
    HADR local service name (HADR_LOCAL_SVC) =
    HADR remote host name (HADR_REMOTE_HOST) =
    HADR remote service name (HADR_REMOTE_SVC) =
    HADR instance name of remote server (HADR_REMOTE_INST) =
    HADR timeout value (HADR_TIMEOUT) = 120
    HADR log write synchronization mode (HADR_SYNCMODE) = NEARSYNC
    HADR peer window duration (seconds) (HADR_PEER_WINDOW) = 0

    First log archive method (LOGARCHMETH1) = OFF
    Options for logarchmeth1 (LOGARCHOPT1) =
    Second log archive method (LOGARCHMETH2) = OFF
    Options for logarchmeth2 (LOGARCHOPT2) =
    Failover log archive path (FAILARCHPATH) =
    Number of log archive retries on error (NUMARCHRETRY) = 5
    Log archive retry Delay (secs) (ARCHRETRYDELAY) = 20
    Vendor options (VENDOROPT) =

    Auto restart enabled (AUTORESTART) = ON
    Index re-creation time and redo index build (INDEXREC) = SYSTEM (RESTART)
    Log pages during index build (LOGINDEXBUILD) = OFF
    Default number of loadrec sessions (DFT_LOADREC_SES) = 1
    Number of database backups to retain (NUM_DB_BACKUPS) = 12
    Recovery history retention (days) (REC_HIS_RETENTN) = 366
    Auto deletion of recovery objects (AUTO_DEL_REC_OBJ) = OFF

    TSM management class (TSM_MGMTCLASS) =
    TSM node name (TSM_NODENAME) =
    TSM owner (TSM_OWNER) =
    TSM password (TSM_PASSWORD) =

    Automatic maintenance (AUTO_MAINT) = ON
    Automatic database backup (AUTO_DB_BACKUP) = OFF
    Automatic table maintenance (AUTO_TBL_MAINT) = ON
    Automatic runstats (AUTO_RUNSTATS) = ON
    Automatic statement statistics (AUTO_STMT_STATS) = OFF
    Automatic statistics profiling (AUTO_STATS_PROF) = OFF
    Automatic profile updates (AUTO_PROF_UPD) = OFF
    Automatic reorganization (AUTO_REORG) = OFF

    Enable XML Character operations (ENABLE_XMLCHAR) = YES
    WLM Collection Interval (minutes) (WLM_COLLECT_INT) = 0
    [/CODE]

  2. #2
    Join Date
    Apr 2012
    Posts
    1,034
    Provided Answers: 18
    What is the database-size? (dont say 'huge', give real GB or TB figure).

    Looks like basic tuning is needed.
    What information do you get from running db2top ? (see bottlenecks, and
    typical metrics like bufferpool hit ratios etc).

    What information do you get from 'top' ?

    Are you using automatic-storage?

    If so, how many storage pools?
    If not, how many containers per tablespace?

    What is your I/O system: SAN or local disks?

    Do you have 4 test databases in a single db2-instance?

    How do you know *which* of the databases is consuming resources?

    You have *circular logging* (not recoverable databases) - is this sensible for your situation?

    Are you sure STMM is the right option if you have multiple instances (and or multiple databases in the same instance).

  3. #3
    Join Date
    Jan 2007
    Location
    Jena, Germany
    Posts
    2,721
    Quote Originally Posted by db2mor View Post
    What is the database-size? (dont say 'huge', give real GB or TB figure).
    Very good point. "Huge" means for me that you are in the 2-digit TB range. Everything below is "small" or "normal". You may have a different opinion, of course.
    Knut Stolze
    IBM DB2 Analytics Accelerator
    IBM Germany Research & Development

  4. #4
    Join Date
    Mar 2011
    Posts
    20
    Hi

    Thank you for the replies.

    The database size is approx 400GB.DB9SR001 is the instance. DB2INST1 is the id with which the databases are created (SYSADMIN group)


    Yes is enabled for automatic storage. We dont have any storage pools defined explicitly. Each tablespace is built up on single container and almost 90% of the total tables are partitioned.

    The containers are defined on local. It is not SAN

    It is test (UAT) environment and hence is not recovery enabled.

    All the databases are defined under single instance. Since performance is slow we have only either one or two databases mostly used at a time.

    We have been using STMM for almost 2 years. It was recommended and also we are experiencing the problems since the amount of data loaded is increased.( almost tripled)

    db2inst1@c01z0090:/db/dbdata01/uatdb> top
    top - 14:08:49 up 129 days, 20:29, 4 users, load average: 1.21, 0.99, 0..79
    Tasks: 160 total, 1 running, 158 sleeping, 0 stopped, 1 zombie
    Cpu(s): 0.7%us, 1.1%sy, 0.0%ni, 97.4%id, 0.0%wa, 0.2%hi, 0.4%si, 0.2%st
    Mem: 24133M total, 22125M used, 2008M free, 34M buffers
    Swap: 1015M total, 16M used, 999M free, 21511M cached

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    14238 root 16 0 62196 15m 5304 S 2 0.1 3298:04 dsmc
    19892 root 25 0 9840 6448 2020 S 1 0.0 0:05.46 iam_extract_ibm
    25229 db2inst1 16 0 2784 1408 1028 R 1 0.0 0:00.03 top
    14 root 10 -5 0 0 0 S 0 0.0 8:26.24 events/0
    16603 db9sr001 19 0 30.3g 10g 10g S 0 45.8 23220:50 db2sysc
    19682 root 15 0 30480 2880 2128 S 0 0.0 0:01.20 exe
    1 root 16 0 848 320 272 S 0 0.0 2:25.20 init
    2 root RT 0 0 0 0 S 0 0.0 0:02.03 migration/0
    3 root 34 19 0 0 0 S 0 0.0 2:50.56 ksoftirqd/0
    4 root RT 0 0 0 0 S 0 0.0 0:03.47 migration/1
    5 root 34 19 0 0 0 S 0 0.0 2:56.27 ksoftirqd/1
    6 root RT 0 0 0 0 S 0 0.0 0:02.11 migration/2


    above is the top information.

    CPU usage is normal generally . Only while executing queries involving joins on tables with large number of recs ( 15-20 millions ) , the I/O is 100%. We do notice 100% sort utilization as well.

    Please let me know if more details are needed

  5. #5
    Join Date
    Nov 2003
    Location
    Am Ende der Welt
    Posts
    12
    Maybe you have performance degradation because of both, configuration and database design problems. If you have the proper indexes, it doesn't matter how big a table is, you can access it efficiently, so I would try to identify poor performing queries at first but it doesn't mean that you don't have to tune database parameters as well.

    This query can help to identify queries that demand too many resources (cpu, i/o, etc):

    SNAPSHOT_TIMESTAMP,NUM_EXECUTIONS,NUM_COMPILATIONS ,PREP_TIME_WORST,PREP_TIME_BEST,INT_ROWS_DELETED,I NT_ROWS_INSERTED,INT_ROWS_UPDATED,ROWS_READ,ROWS_W RITTEN,STMT_SORTS,SORT_OVERFLOWS,TOTAL_SORT_TIME,P OOL_DATA_L_READS,POOL_DATA_P_READS,POOL_TEMP_DATA_ L_READS,POOL_TEMP_DATA_P_READS,POOL_INDEX_L_READS, POOL_INDEX_P_READS,POOL_TEMP_INDEX_L_READS,POOL_TE MP_INDEX_P_READS,POOL_XDA_L_READS,POOL_XDA_P_READS ,POOL_TEMP_XDA_L_READS,POOL_TEMP_XDA_P_READS,TOTAL _EXEC_TIME,TOTAL_EXEC_TIME_MS,TOTAL_USR_CPU_TIME,T OTAL_USR_CPU_TIME,TOTAL_SYS_CPU_TIME,TOTAL_SYS_CPU _TIME,STMT_TEXT,DBPARTITIONNUM,STATS_FABRICATE_TIM E,SYNC_RUNSTATS_TIME from TABLE(SYSPROC.SNAP_GET_DYN_SQL_V95('')) order by TOTAL_EXEC_TIME_MS desc

    Hope this helps,

    César

  6. #6
    Join Date
    Mar 2011
    Posts
    20
    Thanks Caesar for the query. Will check that query

    The DB2 advisor didnot recommend any indexes for the queries with poor performance. Also we cannot create a unidex for all the cols which a user could probably involve in the query ( select or filter criteria).

    Since the performance is slow for majority of queries, could it not be possible that the issue lies with the database configuration.

    Please let me know if any suggestions. Kindly excuse my ignorance as this is the first time am working on performance issues.

    Thank you

  7. #7
    Join Date
    Apr 2012
    Posts
    1,034
    Provided Answers: 18
    A 400GB database is a small database, but even they need tuning and configuring.

    When asked about output of 'top' and 'db2top', it is wrong to post one fragment, you must *study* the output collected over busy times, then interpret that data with the help of your sysadmin. From this you make a plan of what to change, and how to test the effectiveness of each change.


    You posted that "90% of tables are partitioned", do you mean DPF (physical partitioning - multiple nodes) or range-partitioned ?

    For frequent multi-table joins with multiple-column predicates, databases also need good column-group statistics. Do you have any *column-group* statistics explicitly collected? Read the on-line documentation for details.

    Keep in mind that indexes can deliver performance boosts for troubled queries of 1000% while adjusting database software configuration (not hardware) might deliver less than 40% boost.

    If you have even 2 of these 400GB databases concurrently active on a 24GB-RAM 8-core box then you will experience pain if they are poorly configured and badly indexed.

    You posted that you observe 100% I/O utilisation when the table-scanning is happening.

    Verify that your *main* (non temporary) data and index tablespaces have 'no file system caching' enabled. Read the on-line documentation for instructions.

    You posted that all storage is on local disk with automatic storage, but you did not post details. How many local disks? Are these presented to DB2 as a single file-system, or multiple file systems? What kind of striping is happening at operating system level?

    The way forward is to identify the top 10 worst performing queries, verify that these will be part of the everyday use of the application, and then concentrate on optimizing those queries by whatever means.
    If you lack the skills and experience to complete that task, then it may be extremely inefficient (from project management viewpoint) to try to proceed without adequate skills and experience. Get local help.

  8. #8
    Join Date
    Mar 2011
    Posts
    20
    Thanks again for the information

    The table is range partitioned . All the database objects are located on single file system it is not split across multiple disks.

    Regarding the top, I had tried to work with the sysadmin to identify if any abnormal behavior (IO, memory ,CPU) but was reported that it was fine.

    Paging was also OK.

    Auto runstats is enabled and hence timely statistics are collected. We also tried to run the runstats manually on all the tables with distribution detailed and indexes all; Each of the table contains atleast one index ( primary)

    All of our user tablespaces are enabled for file caching system

    As mentioned through db2top we noticed , high sort utilisation for queries with poor performance. During that IO, CPU utilization were also noticed to be high. We increased the sort heap of the database by 20% of earlier , however that could not eliminate the sort overflows.

    Sort heap thres for shared sorts (4KB) (SHEAPTHRES_SHR) = AUTOMATIC(1637642)
    Sort list heap (4KB) (SORTHEAP) = AUTOMATIC(163764)

    Not sure by how much it needs to be increased and cannot gurantee even if further increase could help.

    Would try your other suggestions also. Thank you.

  9. #9
    Join Date
    May 2003
    Location
    USA
    Posts
    5,737
    Quote Originally Posted by saechira View Post
    Thanks again for the information

    The table is range partitioned . All the database objects are located on single file system it is not split across multiple disks.

    Regarding the top, I had tried to work with the sysadmin to identify if any abnormal behavior (IO, memory ,CPU) but was reported that it was fine.

    Paging was also OK.

    Auto runstats is enabled and hence timely statistics are collected. We also tried to run the runstats manually on all the tables with distribution detailed and indexes all; Each of the table contains atleast one index ( primary)

    All of our user tablespaces are enabled for file caching system

    As mentioned through db2top we noticed , high sort utilisation for queries with poor performance. During that IO, CPU utilization were also noticed to be high. We increased the sort heap of the database by 20% of earlier , however that could not eliminate the sort overflows.

    Sort heap thres for shared sorts (4KB) (SHEAPTHRES_SHR) = AUTOMATIC(1637642)
    Sort list heap (4KB) (SORTHEAP) = AUTOMATIC(163764)

    Not sure by how much it needs to be increased and cannot gurantee even if further increase could help.

    Would try your other suggestions also. Thank you.
    If you have large tables which often require table scans to process the query, you will almost always have high I/O and/or high CPU. I assume this is the case, since you said you ran db2 advisor and no indexes were suggested.

    File system caching should be turned off for all DMS tablespaces.
    M. A. Feldman
    IBM Certified DBA on DB2 for Linux, UNIX, and Windows
    IBM Certified DBA on DB2 for z/OS and OS/390

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •