Page 1 of 3 123 LastLast
Results 1 to 15 of 31
  1. #1
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,367

    Unanswered: db2 maintenance scripts - error checking

    I have a script that calls 2 other scripts: reorg, runstats and then executes db2rbind. All write to separate log files. I want to add error and possibly warning checking (check for $? = 4 or $? = 2). If some error/warning is detected, it will execute another script that writes to a log file monitored by ITM monitoring product.


    What's the best way to accomplish this? Do I have to add $? for every reorg/runstats command? Or could I grep the log files for != DB20000I or something similar?

    Does anyone have any examples of error checking for reorg/runstats/rebind maintenance scripts?

  2. #2
    Join Date
    Apr 2012
    Posts
    1,035
    Provided Answers: 18
    My preferred style is to always check the exit-code from CLP.
    It takes minimal effort and the benefits far outweigh the costs.
    I use a common function (whether inlined in the script or dotted in) to do the logging work (and separate common functions for fatal error handling , rollback etc).

    Examples (works for KSH and Bash):

    Code:
    db2 -v  "do something critically important for me"
    (( $? > 3 )) && die "Blab blah blah failed and The end is nigh"
    
    db2 -v "do something useful for mel"
    (( $? > 3 )) && logError "Blah blah blah failed"
    Note: use of "> 3" is deliberate. If your specific situation is sensitive to CLP exit value 1 or 2 or 3 then code accordingly.

    You can do the equivalent thing in Windows CMD scripting but the syntax is different and more ugly but works fine.

  3. #3
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,367
    Another team wrote a script that does the error logging (based on the parameters I provide) and they support the ITM monitoring product that sends email/page/etc... alert.

    I just need a way of detecting that some error/warning occurred and then call their script based on what that is. I think I have to add $? to every single reorg/runstats command or grep the log files for something. No shortcuts.

  4. #4
    Join Date
    Apr 2012
    Posts
    1,035
    Provided Answers: 18
    If you want to be thorough, there are no real shortcuts.
    If you want alerting *as soon as the problem happens* then check after each command.
    If you are not worried about the immediacy (ie. can tolerate much delayed alerting) then you can grep log files - though the principle of reacting to the first-failure at the time of failure is quite useful in enterprise environments.

  5. #5
    Join Date
    Aug 2001
    Location
    UK
    Posts
    4,650
    You don't have to do rc for every command.

    If you want to stop after an error, then use the "s" command line option
    db2 -tsvf file.sql
    stops at the first error

    If you want to continue and at the end want to check codes then you can use something like this

    http://www.dbforums.com/unix-shell-s...egant-way.html

    I normally use the simpler version [ ie grep ]

    For reorg, you can do



    Code:
    # ignores messages listed in the reorg.ignore and reorg.rc4 files and counts.
    # if count is > 0, then it is an error . Send error message to operator console 
    grep -vE "^$|DB20000I|reorg" reorg.out | grep -vf reorg.ignore | grep -cvf reorg.rc4

    Code:
    # ignores messages listed in the reorg.ignore and counts the messages in rerog.rc4.
    # if count is > 0, then it is a rc4 error . Send a warning message to operator console 
    # do this only if you get count 0 when ignoring rc4 messages also [ above ]  
    grep -vE "^$|DB20000I|reorg" reorg.out | grep -vf reorg.ignore | grep -cf reorg.rc4
    where reorg.ignore has codes that can be ignored
    eg. SQL2314W

    reorg.rc4 to have a list of sql/db2 codes will send out a warning message to the operators

    anything that is not in reorg.ignore or reorg.rc4 should be considered serious error and alerted.
    Visit the new-look IDUG Website , register to gain access to the excellent content.

  6. #6
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,367
    Thank you, db2mor and Satya.

    I want to try both methods - checking RC for every command and grepping the log file. First, I want to try checking RC.

    I have a script that generates reorg/runstats for all db's on all servers. Something like:

    reorg_generate.ksh:
    #! /usr/bin/ksh

    . $HOME/sqllib/db2profile

    for DB in `db2 list db directory | grep -ip indirect | grep "Database alias" | awk '{print $4}'`
    do
    db2 "connect to $DB" > /dev/null
    echo "date" >> $DB.reorg.ksh
    echo "db2 connect to $DB" >> $DB.reorg.ksh
    db2 -x "select 'db2 -v reorg table '|| rtrim(tabschema) || '.\' || '\"' || rtrim(tabname)|| '\' || '\"' ||' allow read access' from syscat.tables where type = 'T'" >> $DB.reorg.ksh
    db2 terminate > /dev/null
    echo "date" >> $DB.reorg.ksh
    done


    It generates:

    date
    db2 connect to LDAPDB2
    db2 -v reorg table SYSIBM.\"SYSKEYCOLUSE\" allow read access
    db2 -v reorg table SYSIBM.\"SYSCOLCHECKS\" allow read access
    db2 -v reorg table SYSIBM.\"SYSCOLDIST\" allow read access
    ....


    I want to add RC checking. Something like:

    RC=$?
    if [[ $RC = "2" ]]; then
    <call error logging script with warning level>
    elif [[ $RC > "3" ]]; then
    <call error logging script with error level>
    else exit 0
    fi


    Can you please suggest how to add this code to my reorg_generate.ksh so that RC is checked after every reorg command?

  7. #7
    Join Date
    Aug 2001
    Location
    UK
    Posts
    4,650
    Quote Originally Posted by db2girl View Post
    Thank you, db2mor and Satya.

    I want to try both methods - checking RC for every command and grepping the log file. First, I want to try checking RC.

    I have a script that generates reorg/runstats for all db's on all servers. Something like:

    reorg_generate.ksh:
    #! /usr/bin/ksh

    . $HOME/sqllib/db2profile

    for DB in `db2 list db directory | grep -ip indirect | grep "Database alias" | awk '{print $4}'`
    do
    db2 "connect to $DB" > /dev/null
    echo "date" >> $DB.reorg.ksh
    echo "db2 connect to $DB" >> $DB.reorg.ksh
    db2 -x "select 'db2 -v reorg table '|| rtrim(tabschema) || '.\' || '\"' || rtrim(tabname)|| '\' || '\"' ||' allow read access' from syscat.tables where type = 'T'" >> $DB.reorg.ksh
    db2 terminate > /dev/null
    echo "date" >> $DB.reorg.ksh
    done


    It generates:

    date
    db2 connect to LDAPDB2
    db2 -v reorg table SYSIBM.\"SYSKEYCOLUSE\" allow read access
    db2 -v reorg table SYSIBM.\"SYSCOLCHECKS\" allow read access
    db2 -v reorg table SYSIBM.\"SYSCOLDIST\" allow read access
    ....


    I want to add RC checking. Something like:

    RC=$?
    if [[ $RC = "2" ]]; then
    <call error logging script with warning level>
    elif [[ $RC > "3" ]]; then
    <call error logging script with error level>
    else exit 0
    fi


    Can you please suggest how to add this code to my reorg_generate.ksh so that RC is checked after every reorg command?
    You can take a different approach.

    I am considering one database only
    Code:
    # generate reorg commands
    db2 connect to $DB 
    db2 -x "select 'reorg table '|| rtrim(tabschema) || '.\' ||  '\"' || rtrim(tabname)|| '\' || '\"' ||' allow read access' from syscat.tables where type = 'T'" >> $DB.reorg.txt
    Code:
    #wrapper to execute the db2/sql commands in a file 
    cmdfile=$DB.reorg.txt
    cat ${cmdfile} | while read mycmd
    do 
       db2 -tvf ${mycmd} 
       RC=$?
       if [[ $RC = "2" ]]; then
           <call error logging script with warning level>
        elif [[ $RC > "3" ]]; then
            <call error logging script with error level>
        fi 
    done

    This will make it generic and easy to understand/maitain.
    you can use this "wrapper" for any set of db2 commands or sqls

    you can extend this to build restartable scripts too
    Last edited by sathyaram_s; 05-24-12 at 19:30.
    Visit the new-look IDUG Website , register to gain access to the excellent content.

  8. #8
    Join Date
    Apr 2012
    Posts
    1,035
    Provided Answers: 18
    If you want to use your current approach and also want to alert immediately on the first error then you could use (after customising) the generate_reorg.ksh that is below.
    The main disadvantage of your current approach is that is is serial (apart from lack of error checking).

    The approach of sathyaram_s requires either a complete stop on first error (if using db2 -s) or alteratively delays alerting until all the reorgs are finished (which could be quite a long time).

    The sample below uses a different style to yours, but is basically the same as your original It alerts (async) on each failure, and can continue processing on errors if that is what you wish.

    This sample also has the alerting asyncronous to the reorgs but if you prefer synchronous then remove the "&" after calling your_logging script - of course you will substitute the string 'your_logging' by whatever script and arguments you need.

    Do you have separate logic for reorg indexes?

    Code:
    #! /usr/bin/ksh
    
    thisProfile=$HOME/sqllib/db2profile
    
    set -u
    
    [[ ! -r ${thisProfile} ]] && print "ERROR: missing ${thisProfile} " && exit 9
    
    . ${thisProfile}
    
    
    for DB in $(db2 list db directory | \
                grep -ip indirect | \
                grep "Database alias" | \
                awk '{print $4}' )
    
    do
        genfile=${DB}.reorg.ksh
        [[ -r ${genfile} ]] && rm -f ${genfile}
        db2 -o- "connect to ${DB}"
        (( $? != 0 )) &&  print "ERROR: cannot connect to ${DB}" && exit 9
        print "#!/usr/bin/ksh\ndate\n" > ${genfile}
        (( $? != 0 )) &&  print "ERROR: cannot write ${genfile}" && exit 9
    
        print "db2 connect to ${DB} \n((\$? != 0 )) && your_logging 9 && exit 9" >> ${genfile}
        set -x
        db2 -x "select 'db2 -v reorg table '||\
                rtrim(tabschema) || '.\' || '\"' ||\
                 rtrim(tabname)|| '\' || '\"' ||\
                 ' allow read access' || x'0A' || \
                 'RC=$?; (( RC >= 2 )) && your_logging \$RC & '
                 from syscat.tables where type = 'T'" >> ${genfile}
        (( $? != 0 )) && print "ERROR: query fail on ${DB}" && exit 9
        db2 -o- terminate
        echo "date" >> ${genfile}
    done

  9. #9
    Join Date
    Aug 2001
    Location
    UK
    Posts
    4,650
    Personally, I wouldn't bother sending an alert for every reorg. It will overwhelm the console/alert system and human operators if any.

    A reorg failure does not cause any issue around availability etc.Failure "may" lead to noticeable performance degrade for certain tables only. You can treat those tables as special case and alert immediately.

    If you don't want to wait till the end, but you can live with regular alerts [ not for every failure, but at regular intervals, say, every minute ] then you can use the grep command model I have given above, to check the reorg logs every minute and alert on errors. You can cron the check script to run every minute. If you change your mind later, that you can wait for longer intervals, you can change the schedule.
    Visit the new-look IDUG Website , register to gain access to the excellent content.

  10. #10
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,367
    Thanks a lot, guys! I'll work with your scripts. For now, I want the script to continue executing on error and just to receive an alert if something doesn't work. Offline reorg will rebuild the indexes. I've seen problems where a reorg failure left indexes marked as bad and rebuilding them (during business hrs) took a long time for large tables. The db's I support are small so don't anticipate any serious problem, this is just in case (and to make it similar to what they have on the mainframe).

  11. #11
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,367
    db2mor, I got to play with your script today. Here is what I currently have:

    Code:
    ...
        db2 -x "select 'db2 -v reorg table '||\
                rtrim(tabschema) || '.\' || '\"' ||\
                 rtrim(tabname)|| '\' || '\"' ||\
                 ' allow read access' || x'0A' || \
                'RC=\$?;' || '(( RC >= 2 )) && \$itmlog \$team \$severity_error \$product Script \$script reorg failed rc \$RC & '
                 from syscat.tables where type = 'T'" >> ${genfile}
    ...


    It generates:
    Code:
    ...
    db2 -v reorg table SYSIBM.\"SYSTABLES\" allow read access
    RC=$?;(( RC >= 2 )) && $itmlog $team $severity_error $product Script $script reorg failed rc $RC &
    
    
    db2 -v reorg table SYSIBM.\"SYSCOLUMNS\" allow read access
    RC=$?;(( RC >= 2 )) && $itmlog $team $severity_error $product Script $script reorg failed rc $RC &
    ....


    It it's an error , I want to call $itmlog with $severity_error
    If it's a warning, I want to call $itmlog with $severity_warn


    Do I need to use if/elif or can I do something similar to what you coded with &&. Can I check if RC = 2 or if RC > 2 and then call $itmlog the way you coded it?



    I tried:

    Code:
    db2 -v reorg table SYSIBM.\"SYSTABLE\" allow read access
    RC=$?;(( RC = 2 )) && $itmlog $team $severity_warn $product Script $script reorg failed rc $RC &
    RC=$?;(( RC > 2 )) && $itmlog $team $severity_error $product Script $script reorg failed rc $RC &
    
    
    db2 -v reorg table SYSIBM.\"SYSCOLUMN\" allow read access
    RC=$?
    if [[ $RC = "2" ]]; then
    $itmlog $team $severity_warn $product Script $script reorg failed rc $RC &
    elif [[ $RC > "3" ]]; then
    $itmlog $team $severity_error $product Script $script reorg failed rc $RC &
    fi

    if/elif is fine but not the lines with &&.

  12. #12
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,367
    The log $itmlog script writes to has:


    05/30/12 16:14:57 xtmwxr002 UDB E DB2 Script WAREHOUS.reorg.ksh.TO_TEST reorg failed rc 4
    05/30/12 16:14:57 xtmwxr002 UDB W DB2 Script WAREHOUS.reorg.ksh.TO_TEST reorg failed rc 2
    05/30/12 16:14:57 xtmwxr002 UDB E DB2 Script WAREHOUS.reorg.ksh.TO_TEST reorg failed rc 4


    I have rc 2 in addition to rc4 for table #1. I know the way I coded is wrong, but not sure how to change it to check if RC = 2 or RC > 2 and then invoke the script with &&

  13. #13
    Join Date
    Apr 2012
    Posts
    1,035
    Provided Answers: 18
    Easier if you can simplify by having a wrapper-script
    for itm logging which takes a single argument ($RC)
    and that invokes the $itmlog with the relevant parameters (according to the value of RC)
    because they the itmlog arguments seem to be constants while this reorg script runs.

    Of course it's better if you also pass the tabschema.tabname that failed but that's something you can add later when you get the basic thing working.

    Then in generate_reorg.ksh you can just call this wrapper (when RC >=2) and pass it the RC like this:

    Code:
    RC=$? ; (( RC >=2 )) && itmlog_wrapper_reorg $RC

    ---

    Example script itmlog_wrapper_reorg.ksh



    Code:
    #!/usr/bin/ksh
    # Takes one argument which is the CLP exit code 
    # from reorg.
    # You can extend this later to have a second argument
    # which is the tabschema.tabname of the table that whose
    # reorg gave a warning or error
    #
    (( $# != 1 )) && print "\nERROR: missing argument RC\n" && exit 99
    
    clp_exit_code=$1
    
    itmlogScript=.... # set this to the path+script-name
    team=... # whatever
    product=... # whatever
    script=... # whatever
    severity_error=...# whatever
    severity_warning=... #whatever
    
    if (( clp_exit_code == 2 ))
    then
       $itmlogScript $team $severity_warning $product Script $script reorg warning $clp_exit_code
    else
       $itmlogScript $team $severity_error $product Script $script reorg failed $clp_exit_code
    fi
    hth

  14. #14
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,367
    db2mor, Thanks a lot for your help!

  15. #15
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,367
    How can I catch rebind errors? I don't think I can check RC from CLP. I can do something like:

    db2rbind test -l test.rbind all > rbind.out
    grep rbind.out for 'Errors'. If RC is 0, it means there are errors so call error logging script.


    Is there a better or easier way to do this?

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •