Unanswered: Using db2pd logs information
    If you could review my script below and critique how I intend to use the logs it produces, I would truly appreciate it -

    Script (dtl is toggled between 0 and 1 through a trap and kill combination)

    hst=`uname -a | awk '{ print $2 }'`    # host name suffix for the log files

    i=0
    while [ ${i} -le 720 ]
    do
        x=`date '+%Y%m%d_%H%M%S'`

        db2pd -alldbpartitionnums -edus > edus_${x}_${hst}.log &
        db2pd -alldbpartitionnums -dbptnmem > dbptnmem_${x}_${hst}.log &
        db2pd -alldbpartitionnums -agents -db ${DBNM} > agt_${x}_${hst}.log &
        db2pd -alldbpartitionnums -bufferpools -db ${DBNM} > bfp_${x}_${hst}.log &
        db2pd -alldbpartitionnums -fcm > fcm_${x}_${hst}.log &
        db2pd -db ${DBNM} -logs > logs_${x}_${hst}.log &

        # detailed collection only while the trap handler has set dtl=1
        if [ ${dtl} -eq 1 ]; then
            db2pd -alldbpartitionnums -db ${DBNM} -tcbstats > tcbstats_${x}_${hst}.log &
        fi

        i=$(( i + 1 ))
        sleep 1200
    done

    Here's my proposed plan to use the logs from the above script to perform diagnosis -

    Determining candidate workloads for termination: When information collected by the overall server monitoring script suggests a problem (based on defined thresholds) and the decision is to terminate a workload, two successive snapshots of the edus log should help determine which thread, and therefore which application ID/workload (via the agents log), is consuming the highest percentage of CPU. The DB snapshot currently generated every half hour is NOT a good indicator for this, since it presents cumulative CPU figures, which can rank a low-priority workload currently consuming little CPU ahead of a high-priority workload currently consuming a lot.
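    The two-snapshot comparison above can be sketched with awk. The column layout used here (EDU ID, EDU name, user CPU seconds, system CPU seconds) is a fabricated simplification of real db2pd -edus output, and the sample snapshot files are made up, so field positions would need adjusting to the actual db2pd level:

    ```shell
    # Fabricated sample snapshots standing in for two edus logs taken
    # 20 minutes apart (columns: EDU ID, EDU name, USR secs, SYS secs).
    cat > snap1.txt <<'EOF'
    12 db2agent 10.0 2.0
    13 db2pfchr 5.0 1.0
    EOF
    cat > snap2.txt <<'EOF'
    12 db2agent 70.0 8.0
    13 db2pfchr 6.0 1.5
    EOF

    # Join the two snapshots on EDU ID and rank EDUs by CPU seconds
    # consumed in the interval (not cumulatively since db2start).
    awk '
      NR == FNR { base[$1] = $3 + $4; next }   # 1st snapshot: cumulative CPU per EDU
      { printf "%s %s %.1f\n", $1, $2, $3 + $4 - base[$1] }  # 2nd: interval delta
    ' snap1.txt snap2.txt | sort -k3,3 -rn | head -5
    ```

    The top line names the EDU to chase back through the agents log to an application ID.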
    Determine issues with archival logging:
    Symptom: Applications report timeout-based commits
    Suggested Review: One item to check is the health of the logging process.
    Check whether the Archival Status has reported failures - if so, work with the storage team to assess disk health, in addition to reviewing the diagnostics for more detailed errors
    Compare the 'Next Log to Archive' value to the 'Current Log Number'. If the former is lower, archiving is falling behind and the log path is about to fill up, reducing application throughput and possibly causing application failures
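    As a sketch, that lag check could be scripted against a captured logs snapshot. The sample file below is fabricated, and the exact field labels (for example, a 'Method 1' prefix on the archive fields) vary between db2pd levels, so the patterns may need adjusting:

    ```shell
    # Fabricated excerpt standing in for one logs_*.log snapshot.
    cat > logs_sample.log <<'EOF'
    Current Log Number            2041
    Method 1 Next Log to Archive  2035
    Method 1 Archive Status       Success
    EOF

    # Pull both log numbers and flag how far the archiver is behind.
    cur=$(awk '/^Current Log Number/ { print $NF }' logs_sample.log)
    nxt=$(awk '/Next Log to Archive/ { print $NF }' logs_sample.log)
    lag=$(( cur - nxt ))
    if [ "$lag" -gt 0 ]; then
        echo "archiver is $lag logs behind - log path may fill"
    else
        echo "archiving is keeping pace"
    fi
    ```

    A small positive lag is normal; a growing one over successive snapshots is the signal to involve storage.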
    Determine sub-optimal access paths:
    Symptom: Applications report increased execution time for several workloads
    Suggested Review:
    Examine the dbptnmem log. If a significant amount of memory is being consumed by FCM buffers in comparison with DB and APPL memory, consult the fcm logs to identify the application IDs consuming the highest number of FCM buffers. Review those workloads for non-collocated joins and assess whether MQTs or application changes could help
    Examine at least two consecutive bufferpool logs. Determine the bufferpool with the highest incremental data reads and writes. Cross-reference this with current workloads and first determine whether the workload should even be allowed to consume that many resources (possible when workloads kicked off during the day continue well into the night). If not, terminate the workload, releasing resources for other workloads. If yes, determine whether data reads dominate - and subsequently check for missing indexes
    Selectively run tcbstats, generating incremental snapshots, to see which tables are being scanned the most, and then trace back to the workload (reported in tcbstats). Assess whether indexes, MQTs, or MDCs would help
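    The incremental scan comparison follows the same two-snapshot pattern as the CPU check. The two-column layout here (table name, cumulative scan count) is a fabricated simplification of real db2pd -tcbstats output, with made-up table names and counts, so field positions would need adjusting in practice:

    ```shell
    # Fabricated sample tcbstats snapshots (columns: table, cumulative scans).
    cat > tcb1.txt <<'EOF'
    SALES.ORDERS 100
    SALES.ITEMS  40
    EOF
    cat > tcb2.txt <<'EOF'
    SALES.ORDERS 101
    SALES.ITEMS  400
    EOF

    # Rank tables by scans performed between the two snapshots.
    awk '
      NR == FNR { base[$1] = $2; next }     # 1st snapshot: cumulative scans
      { printf "%s %d\n", $1, $2 - base[$1] }  # 2nd: scans in the interval
    ' tcb1.txt tcb2.txt | sort -k2,2 -rn
    ```

    Tables at the top of this list are the candidates to review for missing indexes, MQTs, or MDCs.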

    Thanks again!

    - G
