#1
09-02-10, 11:40
hazy_dba
Registered User
 
Join Date: Dec 2009
Posts: 40
Logs not archiving to TSM [Solved]

This is the environment:

Quote:
AIX 5.3.0.0

DB21085I Instance "db2sp1" uses "64" bits and DB2 code release "SQL09015" with
level identifier "01060107".
Informational tokens are "DB2 v9.1.0.5", "s080512", "U815922", and Fix Pack "5".

Two of the four databases (each on its own instance) had stopped backing up to TSM and stopped archiving their logs to TSM. The other two databases were fine.

So many logs were being held on disk that space was becoming critical; if I didn't do something soon, the databases would stop.

db2adutl showed that the two databases started failing their online backups and stopped archiving logs on the same date.


I had hoped to avoid recycling the instances, but after much investigation I concluded that something had been updated around the two affected instances, so I decided to do it.

Unfortunately I am not sure exactly what was updated; this is a development server, so it gets a lot thrown at it. Possibly something Java related. Whatever it was, I think it broke the connection to TSM for the two instances.

It is a clear lesson to me to insist that the developers talk to me more and I can schedule in this sort of maintenance.


Here is an excerpt from the db2diag.log from before the db2stop.

Quote:
2010-09-02-10.43.22.498325+060 E9993752A385 LEVEL: Warning
PID : 2330726 TID : 1 PROC : db2logmgr (SP1) 0
INSTANCE: db2sp1 NODE : 000
FUNCTION: DB2 UDB, data protection services, sqlpgArchiveLogFile, probe:3150
MESSAGE : ADM1848W Failed archive for log file "S0000065.LOG" to "TSM chain 0"
from "/db2/SP1/log_dir/NODE0000/".

2010-09-02-10.43.22.500700+060 I9994138A379 LEVEL: Error
PID : 2330726 TID : 1 PROC : db2logmgr (SP1) 0
INSTANCE: db2sp1 NODE : 000
FUNCTION: DB2 UDB, data protection services, sqlpgArchiveLogFile, probe:3160
MESSAGE : Failed to archive log file S0000065.LOG to TSM chain 0 from
/db2/SP1/log_dir/NODE0000/ with rc = 11.

2010-09-02-10.43.22.503264+060 I9994518A378 LEVEL: Warning
PID : 2330726 TID : 1 PROC : db2logmgr (SP1) 0
INSTANCE: db2sp1 NODE : 000
FUNCTION: DB2 UDB, data protection services, sqlpgRetryFailedArchive, probe:4780
MESSAGE : Still unable to archive log file 65 due to rc 11 for LOGARCHMETH1
using method 2 and target .
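
For anyone who needs to pull these failures out of a large db2diag.log, here is a minimal Python sketch. It assumes that every record starts with a timestamp line, as in the excerpt above, and that the error message keeps the "Failed to archive log file ... with rc = N" wording shown there. The helper names `split_records` and `failed_archives` are made up for illustration and are not part of any DB2 tooling. The sample text is copied from the excerpt.

```python
import re

# Sample copied from the db2diag.log excerpt in this post.
SAMPLE = """\
2010-09-02-10.43.22.498325+060 E9993752A385 LEVEL: Warning
PID : 2330726 TID : 1 PROC : db2logmgr (SP1) 0
INSTANCE: db2sp1 NODE : 000
FUNCTION: DB2 UDB, data protection services, sqlpgArchiveLogFile, probe:3150
MESSAGE : ADM1848W Failed archive for log file "S0000065.LOG" to "TSM chain 0"
from "/db2/SP1/log_dir/NODE0000/".

2010-09-02-10.43.22.500700+060 I9994138A379 LEVEL: Error
PID : 2330726 TID : 1 PROC : db2logmgr (SP1) 0
INSTANCE: db2sp1 NODE : 000
FUNCTION: DB2 UDB, data protection services, sqlpgArchiveLogFile, probe:3160
MESSAGE : Failed to archive log file S0000065.LOG to TSM chain 0 from
/db2/SP1/log_dir/NODE0000/ with rc = 11.

2010-09-02-10.43.22.503264+060 I9994518A378 LEVEL: Warning
PID : 2330726 TID : 1 PROC : db2logmgr (SP1) 0
INSTANCE: db2sp1 NODE : 000
FUNCTION: DB2 UDB, data protection services, sqlpgRetryFailedArchive, probe:4780
MESSAGE : Still unable to archive log file 65 due to rc 11 for LOGARCHMETH1
using method 2 and target .
"""

# Assumption: a record begins on a line that starts with a db2diag timestamp.
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6}", re.M)

# Matches the probe:3160 style message, which may wrap onto a second line.
FAILED = re.compile(
    r"Failed to archive log file (\S+) to (.+?) from\s+(\S+) with rc = (\d+)"
)


def split_records(text):
    """Split raw db2diag.log text into one string per record."""
    starts = [m.start() for m in RECORD_START.finditer(text)]
    ends = starts[1:] + [len(text)]
    return [text[a:b].strip() for a, b in zip(starts, ends)]


def failed_archives(text):
    """Return one dict per 'Failed to archive log file' record."""
    found = []
    for record in split_records(text):
        match = FAILED.search(record)
        if match:
            found.append({
                "timestamp": record.split()[0],
                "log_file": match.group(1),
                "target": match.group(2),
                "rc": int(match.group(4)),
            })
    return found


for failure in failed_archives(SAMPLE):
    print(failure)
```

Only the middle record matches, because the ADM1848W warning and the retry message use different wording. Counting these per day should show when the archiving started to fail.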

I moved the db2diag.log to db2diag.log.20100902 so that a new one would be started, then issued a db2start.

The db2start was noted in the log, but nothing else.

Issued a...

db2 backup db <dbname> online use TSM

...and at this point the logs started being archived, even before the backup was acknowledged in the log. (An example of this is at the bottom of the post.)

Having repeated the experiment on another database, I see that one or more logs start being archived at the point where DB2 must make the connection to TSM for the backup, before the backup itself is recorded in the db2diag.log.
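
The sequence described above can be written down as a small Python sketch that only builds and prints the commands, so they can be reviewed before anyone runs them. It is an illustration under assumptions, not a tested procedure: the function name `recovery_commands` is hypothetical, the db2diag.log path is whatever applies on your system, and the database name SP1 is taken from the log excerpts. The commands themselves (db2stop, db2start, and the online backup to TSM) are the ones mentioned in this post.

```python
import datetime
import shlex


def recovery_commands(dbname, diag_log, today=None):
    """Build the command sequence from this post as argv lists.

    diag_log is the path to the current db2diag.log (an assumption:
    pass the real path for your instance).
    """
    today = today or datetime.date.today()
    rotated = f"{diag_log}.{today:%Y%m%d}"  # e.g. db2diag.log.20100902
    return [
        ["db2stop"],
        ["mv", diag_log, rotated],  # a fresh db2diag.log gets started
        ["db2start"],
        ["db2", "backup", "db", dbname, "online", "use", "TSM"],
    ]


if __name__ == "__main__":
    # Dry run only: print the commands instead of executing them.
    for argv in recovery_commands("SP1", "db2diag.log", datetime.date(2010, 9, 2)):
        print(shlex.join(argv))
```

Printing rather than executing is deliberate, since a db2stop on the wrong instance is not something to automate blindly.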


Firstly, I hope this helps someone else as I have failed to find something like this when googling the problem.

Secondly, if anyone can shed some more light on what happened here, I would love to hear it! As ever thanks for your time!!


Quote:
2010-09-02-12.51.00.968728+060 E10003341A1033 LEVEL: Event
PID : 1220666 TID : 1 PROC : db2star2
INSTANCE: db2sp1 NODE : 000
FUNCTION: DB2 UDB, base sys utilities, DB2StartMain, probe:911
MESSAGE : ADM7513W Database manager has started.
START : DB2 DBM
DATA #1 : Build Level, 152 bytes
Instance "db2sp1" uses "64" bits and DB2 code release "SQL09015"
with level identifier "01060107".
Informational tokens are "DB2 v9.1.0.5", "s080512", "U815922", Fix Pack "5".
DATA #2 : System Info, 224 bytes
System: AIX swosbxsap0a 3 5 00C7675C4C00
CPU: total:32 online:4 Threading degree per core:2
Physical Memory(MB): total:32768 free:2617
Virtual Memory(MB): total:83840 free:41197
Swap Memory(MB): total:51072 free:38580
Kernel Params: msgMaxMessageSize:4194304 msgMaxQueueSize:4194304
shmMax:68719476736 shmMin:1 shmIDs:131072
shmSegments:68719476736 semIDs:131072 semNumPerID:65535
semOps:1024 semMaxVal:32767 semAdjustOnExit:16384

2010-09-02-12.52.21.032510+060 I10004375A312 LEVEL: Warning
PID : 2408470 TID : 1 PROC : db2logmgr (SP1) 0
INSTANCE: db2sp1 NODE : 000
FUNCTION: DB2 UDB, data protection services, sqlpgArchiveLogFile, probe:3108
MESSAGE : Started archive for log file S0000065.LOG.

2010-09-02-12.52.21.100246+060 I10004688A368 LEVEL: Warning
PID : 2519116 TID : 1 PROC : db2agent (SP1) 0
INSTANCE: db2sp1 NODE : 000
APPHDL : 0-7 APPID: *LOCAL.db2sp1.100902115219
AUTHID : DB2SP1
FUNCTION: DB2 UDB, relation data serv, sqlrr_db_init, probe:100
MESSAGE : DB2_IMPLICIT_UNICODE enabled

2010-09-02-12.52.21.538786+060 E10005057A410 LEVEL: Event
PID : 2498776 TID : 1 PROC : db2stmm (SP1) 0
INSTANCE: db2sp1 NODE : 000 DB : SP1
APPHDL : 0-8 APPID: *LOCAL.DB2.100902115221
AUTHID : DB2SP1
FUNCTION: DB2 UDB, Self tuning memory manager, stmmLog, probe:1009
DATA #1 : <preformatted>
Starting STMM log from file number 613

2010-09-02-12.52.21.590463+060 E10005468A392 LEVEL: Info
PID : 2519116 TID : 1 PROC : db2agent (SP1) 0
INSTANCE: db2sp1 NODE : 000 DB : SP1
APPHDL : 0-7 APPID: *LOCAL.db2sp1.100902115219
AUTHID : DB2SP1
FUNCTION: DB2 UDB, database utilities, sqlubcka, probe:307
MESSAGE : Performing SMS LOB backups with an S lock

2010-09-02-12.52.21.613008+060 E10005861A393 LEVEL: Info
PID : 2519116 TID : 1 PROC : db2agent (SP1) 0
INSTANCE: db2sp1 NODE : 000 DB : SP1
APPHDL : 0-7 APPID: *LOCAL.db2sp1.100902115219
AUTHID : DB2SP1
FUNCTION: DB2 UDB, database utilities, sqlubSetupJobControl, probe:1410
MESSAGE : Starting an online db backup.

2010-09-02-12.52.23.570041+060 I10006255A372 LEVEL: Warning
PID : 2408470 TID : 1 PROC : db2logmgr (SP1) 0
INSTANCE: db2sp1 NODE : 000
FUNCTION: DB2 UDB, data protection services, sqlpgArchiveLogFile, probe:3170
MESSAGE : Completed archive for log file S0000065.LOG to TSM chain 0 from
/db2/SP1/log_dir/NODE0000/.
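
To check the ordering claimed above (the archive starts before the backup is recorded), the timestamps from this excerpt can be compared with a short Python sketch. It assumes the db2diag timestamp layout shown here, and it ignores the trailing +060, which I take to be the offset from UTC in minutes; that is safe only because all three entries carry the same offset. The name `parse_ts` is made up for illustration.

```python
from datetime import datetime


def parse_ts(stamp):
    """Parse a db2diag timestamp, ignoring the trailing UTC offset."""
    # The first 26 characters are YYYY-MM-DD-HH.MM.SS.ffffff
    return datetime.strptime(stamp[:26], "%Y-%m-%d-%H.%M.%S.%f")


# Timestamps copied from the excerpt above.
EVENTS = {
    "archive_started": "2010-09-02-12.52.21.032510+060",    # probe:3108
    "backup_started": "2010-09-02-12.52.21.613008+060",     # probe:1410
    "archive_completed": "2010-09-02-12.52.23.570041+060",  # probe:3170
}

times = {name: parse_ts(stamp) for name, stamp in EVENTS.items()}

# How long before the backup entry did the archive of S0000065.LOG start?
lead = (times["backup_started"] - times["archive_started"]).total_seconds()
# How long did the archive itself take?
duration = (times["archive_completed"] - times["archive_started"]).total_seconds()

print(f"archive started {lead:.6f}s before the backup entry")
print(f"archive took {duration:.6f}s")
```

On these three entries the archive of S0000065.LOG starts roughly 0.58 seconds before the "Starting an online db backup" message and completes about 2.5 seconds after it started.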
#2
09-02-10, 12:46
n_i
:-)
 
Join Date: Jun 2003
Location: Toronto, Canada
Posts: 4,448
Code:
2010-09-02-10.43.22.500700+060 I9994138A379 LEVEL: Error
PID : 2330726 TID : 1 PROC : db2logmgr (SP1) 0
INSTANCE: db2sp1 NODE : 000
FUNCTION: DB2 UDB, data protection services, sqlpgArchiveLogFile, probe:3160
MESSAGE : Failed to archive log file S0000065.LOG to TSM chain 0 from
/db2/SP1/log_dir/NODE0000/ with rc = 11.
TSM error code 11 means that there's no space on the destination media.

Code:
#define DSM_RS_ABORT_NO_STO_SPACE_SKIP       11
Quote:
It is a clear lesson to me to insist that the developers talk to me more and I can schedule in this sort of maintenance.
Instead of talking to the developers, who are unable to mess with the TSM configuration (unless, of course, they have DBADM or SYSADM privileges, in which case you need to talk to yourself), you should work with the TSM admin to increase storage space or change retention policies.
#3
09-03-10, 04:20
hazy_dba
Registered User
 
Join Date: Dec 2009
Posts: 40
Quote:
Originally Posted by n_i
TSM error code 11 means that there's no space on the destination media.
I was about to ask you to point me in the direction of how you know this, but googling "DSM_RS_ABORT_NO_STO_SPACE_SKIP" got me everything I need.

Thank you!


Don't you think it is odd that it is a space issue? Why would any logs get archived to TSM after a restart and online backup? Is it overwriting something on the TSM storage?

Alternatively, some people on the web with similar issues seem to think that a configuration issue may have caused this. Would you consider this a possibility?
#4
11-17-10, 06:51
jvandenbroek
Registered User
 
Join Date: Nov 2010
Posts: 1
Quote:
Originally Posted by hazy_dba
Quote:
Originally Posted by n_i
TSM error code 11 means that there's no space on the destination media.
<...>
Don't you think it is odd that it is a space issue? Why would any logs get archived to TSM after a restart and online backup? Is it overwriting something on the TSM storage?

Alternatively, some people on the web with similar issues seem to think that a configuration issue may have caused this. Would you consider this a possibility?
The rc=11 isn't necessarily a TSM code; check out this Technote on IBM's Support site:
IBM - Interpreting Vendor API return codes from db2diag.log messages

Cheers!

--
Jeroen
#5
11-18-10, 04:04
hazy_dba
Registered User
 
Join Date: Dec 2009
Posts: 40
Thanks Jeroen, I will check that link out.