  1. #1
    Join Date
    Dec 2009
    Posts
    43

    Unanswered: Logs not archiving to TSM [Solved]

    The environment:

    AIX 5.3.0.0

    DB21085I Instance "db2sp1" uses "64" bits and DB2 code release "SQL09015" with
    level identifier "01060107".
    Informational tokens are "DB2 v9.1.0.5", "s080512", "U815922", and Fix Pack "5".

    Two databases out of four (each database in its own instance) had stopped backing up to TSM and had stopped archiving their logs to TSM. The other two databases were fine.

    So many logs were being held on disk that space was becoming critical; if I didn't do something soon, the databases would stop.

    db2adutl showed that the two databases started failing their online backups and stopped archiving logs on the same date.


    I had hoped to avoid recycling the instance, but after much investigation I concluded that something had been updated around the two affected instances, so I decided to restart them.

    Unfortunately, I am not sure exactly what was updated. This is a development server, so it gets a lot thrown at it. Possibly something Java-related. Whatever it was, I think it broke the connection to TSM for the two instances.

    It is a clear lesson to me to insist that the developers talk to me more and I can schedule in this sort of maintenance.


    Here is an excerpt from db2diag.log from before the db2stop.

    2010-09-02-10.43.22.498325+060 E9993752A385 LEVEL: Warning
    PID : 2330726 TID : 1 PROC : db2logmgr (SP1) 0
    INSTANCE: db2sp1 NODE : 000
    FUNCTION: DB2 UDB, data protection services, sqlpgArchiveLogFile, probe:3150
    MESSAGE : ADM1848W Failed archive for log file "S0000065.LOG" to "TSM chain 0"
    from "/db2/SP1/log_dir/NODE0000/".

    2010-09-02-10.43.22.500700+060 I9994138A379 LEVEL: Error
    PID : 2330726 TID : 1 PROC : db2logmgr (SP1) 0
    INSTANCE: db2sp1 NODE : 000
    FUNCTION: DB2 UDB, data protection services, sqlpgArchiveLogFile, probe:3160
    MESSAGE : Failed to archive log file S0000065.LOG to TSM chain 0 from
    /db2/SP1/log_dir/NODE0000/ with rc = 11.

    2010-09-02-10.43.22.503264+060 I9994518A378 LEVEL: Warning
    PID : 2330726 TID : 1 PROC : db2logmgr (SP1) 0
    INSTANCE: db2sp1 NODE : 000
    FUNCTION: DB2 UDB, data protection services, sqlpgRetryFailedArchive, probe:4780
    MESSAGE : Still unable to archive log file 65 due to rc 11 for LOGARCHMETH1
    using method 2 and target .
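
For anyone checking their own db2diag.log for the same symptom, here is a minimal Python sketch that pulls the failed-archive messages and their return codes out of the log text. The regular expression is based only on the message wording in the excerpt above; the function name `failed_archives` is made up for illustration, and other DB2 levels may word the message differently.

```python
import re

# Pattern for the db2logmgr failure message seen in the excerpt above.
# The message wraps across lines in db2diag.log, so whitespace is matched loosely.
FAILED_ARCHIVE = re.compile(
    r"Failed to archive log file\s+(?P<log>S\d+\.LOG)\s+to\s+(?P<target>.+?)"
    r"\s+from\s+(?P<path>\S+)\s+with rc = (?P<rc>\d+)\.",
    re.DOTALL,
)


def failed_archives(diag_text):
    """Return (log_file, target, rc) for every failed-archive message found."""
    return [
        (m.group("log"), " ".join(m.group("target").split()), int(m.group("rc")))
        for m in FAILED_ARCHIVE.finditer(diag_text)
    ]


# Sample text copied from the excerpt above.
sample = """\
MESSAGE : Failed to archive log file S0000065.LOG to TSM chain 0 from
/db2/SP1/log_dir/NODE0000/ with rc = 11.
"""
print(failed_archives(sample))
```

Running this over the whole file gives a quick list of which log files are stuck and with which return code, which is easier to scan than the raw log.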

    I moved db2diag.log to db2diag.log.20100902 so that a new one would be started, then issued a db2start.

    The db2start was noted in the log, but nothing else.

    I then issued...

    db2 backup db <dbname> online use TSM

    ...and at this point the logs started getting archived, even before the backup was acknowledged in the log. (An example of this is at the bottom of the post.)

    Having repeated this experiment on another database, I can see that one or more logs start getting archived at the point where DB2 must make the connection to TSM for the backup, before the backup itself is recorded in the log.
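
To put a number on that ordering, the timestamps of the two relevant entries in the excerpt at the bottom of this post can be compared. This is only a sketch: the helper name `parse_diag_timestamp` is hypothetical, and treating the trailing `+060` as a UTC offset in minutes is an assumption about the db2diag.log timestamp format.

```python
from datetime import datetime, timedelta, timezone


def parse_diag_timestamp(stamp):
    """Parse a db2diag.log timestamp such as 2010-09-02-12.52.21.032510+060.

    Assumption: the trailing signed number is a UTC offset in minutes.
    """
    base, sign, offset = stamp[:26], stamp[26], stamp[27:]
    minutes = int(offset) * (1 if sign == "+" else -1)
    tz = timezone(timedelta(minutes=minutes))
    return datetime.strptime(base, "%Y-%m-%d-%H.%M.%S.%f").replace(tzinfo=tz)


# "Started archive for log file S0000065.LOG." (db2logmgr entry)
archive_started = parse_diag_timestamp("2010-09-02-12.52.21.032510+060")
# "Starting an online db backup." (db2agent entry)
backup_logged = parse_diag_timestamp("2010-09-02-12.52.21.613008+060")

# A positive result means the archive started before the backup was logged.
print(backup_logged - archive_started)
```

For the two entries quoted, the archive start is logged roughly 0.58 seconds before the backup start message.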


    Firstly, I hope this helps someone else as I have failed to find something like this when googling the problem.

    Secondly, if anyone can shed some more light on what happened here, I would love to hear it! As ever thanks for your time!!


    2010-09-02-12.51.00.968728+060 E10003341A1033 LEVEL: Event
    PID : 1220666 TID : 1 PROC : db2star2
    INSTANCE: db2sp1 NODE : 000
    FUNCTION: DB2 UDB, base sys utilities, DB2StartMain, probe:911
    MESSAGE : ADM7513W Database manager has started.
    START : DB2 DBM
    DATA #1 : Build Level, 152 bytes
    Instance "db2sp1" uses "64" bits and DB2 code release "SQL09015"
    with level identifier "01060107".
    Informational tokens are "DB2 v9.1.0.5", "s080512", "U815922", Fix Pack "5".
    DATA #2 : System Info, 224 bytes
    System: AIX swosbxsap0a 3 5 00C7675C4C00
    CPU: total:32 online:4 Threading degree per core:2
    Physical Memory(MB): total:32768 free:2617
    Virtual Memory(MB): total:83840 free:41197
    Swap Memory(MB): total:51072 free:38580
    Kernel Params: msgMaxMessageSize:4194304 msgMaxQueueSize:4194304
    shmMax:68719476736 shmMin:1 shmIDs:131072
    shmSegments:68719476736 semIDs:131072 semNumPerID:65535
    semOps:1024 semMaxVal:32767 semAdjustOnExit:16384

    2010-09-02-12.52.21.032510+060 I10004375A312 LEVEL: Warning
    PID : 2408470 TID : 1 PROC : db2logmgr (SP1) 0
    INSTANCE: db2sp1 NODE : 000
    FUNCTION: DB2 UDB, data protection services, sqlpgArchiveLogFile, probe:3108
    MESSAGE : Started archive for log file S0000065.LOG.

    2010-09-02-12.52.21.100246+060 I10004688A368 LEVEL: Warning
    PID : 2519116 TID : 1 PROC : db2agent (SP1) 0
    INSTANCE: db2sp1 NODE : 000
    APPHDL : 0-7 APPID: *LOCAL.db2sp1.100902115219
    AUTHID : DB2SP1
    FUNCTION: DB2 UDB, relation data serv, sqlrr_db_init, probe:100
    MESSAGE : DB2_IMPLICIT_UNICODE enabled

    2010-09-02-12.52.21.538786+060 E10005057A410 LEVEL: Event
    PID : 2498776 TID : 1 PROC : db2stmm (SP1) 0
    INSTANCE: db2sp1 NODE : 000 DB : SP1
    APPHDL : 0-8 APPID: *LOCAL.DB2.100902115221
    AUTHID : DB2SP1
    FUNCTION: DB2 UDB, Self tuning memory manager, stmmLog, probe:1009
    DATA #1 : <preformatted>
    Starting STMM log from file number 613

    2010-09-02-12.52.21.590463+060 E10005468A392 LEVEL: Info
    PID : 2519116 TID : 1 PROC : db2agent (SP1) 0
    INSTANCE: db2sp1 NODE : 000 DB : SP1
    APPHDL : 0-7 APPID: *LOCAL.db2sp1.100902115219
    AUTHID : DB2SP1
    FUNCTION: DB2 UDB, database utilities, sqlubcka, probe:307
    MESSAGE : Performing SMS LOB backups with an S lock

    2010-09-02-12.52.21.613008+060 E10005861A393 LEVEL: Info
    PID : 2519116 TID : 1 PROC : db2agent (SP1) 0
    INSTANCE: db2sp1 NODE : 000 DB : SP1
    APPHDL : 0-7 APPID: *LOCAL.db2sp1.100902115219
    AUTHID : DB2SP1
    FUNCTION: DB2 UDB, database utilities, sqlubSetupJobControl, probe:1410
    MESSAGE : Starting an online db backup.

    2010-09-02-12.52.23.570041+060 I10006255A372 LEVEL: Warning
    PID : 2408470 TID : 1 PROC : db2logmgr (SP1) 0
    INSTANCE: db2sp1 NODE : 000
    FUNCTION: DB2 UDB, data protection services, sqlpgArchiveLogFile, probe:3170
    MESSAGE : Completed archive for log file S0000065.LOG to TSM chain 0 from
    /db2/SP1/log_dir/NODE0000/.

  2. #2
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    Code:
    2010-09-02-10.43.22.500700+060 I9994138A379 LEVEL: Error
    PID : 2330726 TID : 1 PROC : db2logmgr (SP1) 0
    INSTANCE: db2sp1 NODE : 000
    FUNCTION: DB2 UDB, data protection services, sqlpgArchiveLogFile, probe:3160
    MESSAGE : Failed to archive log file S0000065.LOG to TSM chain 0 from
    /db2/SP1/log_dir/NODE0000/ with rc = 11.

    TSM error code 11 means that there's no space on the destination media.

    Code:
    #define DSM_RS_ABORT_NO_STO_SPACE_SKIP       11

    Quote:
    It is a clear lesson to me to insist that the developers talk to me more and I can schedule in this sort of maintenance.

    Instead of talking to the developers, who are unable to mess with the TSM configuration (unless, of course, they have DBADM or SYSADM privileges, in which case you need to talk to yourself), you should work with the TSM admin to increase storage space or change retention policies.

  3. #3
    Join Date
    Dec 2009
    Posts
    43
    Quote Originally Posted by n_i
    TSM error code 11 means that there's no space on the destination media.

    I was about to ask how you know this, but googling "DSM_RS_ABORT_NO_STO_SPACE_SKIP" got me everything I need.

    Thank you!


    Don't you think it is odd that it is a space issue? Why would any logs get archived to TSM after a restart and online backup? Is it overwriting something on the TSM storage?

    Alternatively, some people on the web with similar issues seem to think it may be a configuration issue that has caused this. Would you consider this a possibility?

  4. #4
    Join Date
    Nov 2010
    Posts
    1
    Quote Originally Posted by hazy_dba
    Quote:
    Originally Posted by n_i
    TSM error code 11 means that there's no space on the destination media.
    <...>
    Don't you think it is odd that it is a space issue? Why would any logs get archived to TSM after a restart and online backup? Is it overwriting something on the TSM storage?

    Alternatively, some people on the web with similar issues seem to think it may be a configuration issue that has caused this. Would you consider this a possibility?

    The rc=11 isn't necessarily a TSM code. Check out this Technote on IBM's Support site:
    IBM - Interpreting Vendor API return codes from db2diag.log messages
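
To illustrate why the layer matters, here is a small Python sketch in which the same number means different things depending on which layer reported it. The TSM entry is the constant quoted earlier in this thread. The vendor API entry (`SQLUV_INIT_FAILED`) is from memory of DB2's sqluvend.h header and is an assumption to verify against the Technote; the layer names and the `explain_rc` helper are made up for illustration.

```python
# Meaning of return code 11 at two different layers.
# "tsm_api": the constant quoted earlier in this thread.
# "db2_vendor_api": ASSUMPTION recalled from sqluvend.h; verify against the Technote.
RC_MEANINGS = {
    ("tsm_api", 11): "DSM_RS_ABORT_NO_STO_SPACE_SKIP (no space on the destination media)",
    ("db2_vendor_api", 11): "SQLUV_INIT_FAILED (vendor session initialization failed)",
}


def explain_rc(layer, rc):
    """Look up a return code for a given layer; unknown pairs are reported as such."""
    return RC_MEANINGS.get((layer, rc), f"rc {rc} is not in this sketch for layer {layer!r}")


print(explain_rc("tsm_api", 11))
print(explain_rc("db2_vendor_api", 11))
```

If the rc in db2diag.log comes from the vendor API layer rather than from TSM itself, an initialization failure would fit the observation that archiving resumed after the instance was restarted, though the thread does not confirm which layer produced it.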

    Cheers!

    --
    Jeroen

  5. #5
    Join Date
    Dec 2009
    Posts
    43
    Thanks Jeroen, I will check that link out.
