Unanswered: OS upgrade and Hardware change at the same time
Database version - V9.5 FP3a
Operating system AIX
Currently our database is running on AIX 5.3. Our AIX team tried to do a hardware move and an OS upgrade at the same time, meaning the new hardware hosts AIX 6.1. After this change, when I tried to bring up the database on the new hardware, the instance started but the database wouldn't activate and threw the error "An unexpected system error occurred". The db2diag.log complains that it couldn't get the correct LSN. Here is the error which I see
When we backed out the change and moved back to the old hardware with AIX 5.3, everything came back up fine. At this point neither I nor the AIX team can think of any reason for that error. We are now trying to put AIX 5.3 on the new hardware and then move the db to it. I will keep you posted.
If someone can think of any technical reason why this error could have happened, it would be greatly appreciated.
Looks like crash recovery failed due to a corrupted log... Was the database in a consistent state prior to the upgrade? You should be able to terminate all connections to the db, stop the instance, perform OS changes and then restart the instance/database. Do you see any errors in the AIX log from that time? Were the logs in the active log dir backed up/moved during the upgrade and then restored during backout? I'm just guessing here...
No errors are seen in the AIX logs. The active log directory was neither backed up nor moved during the upgrade. The interesting thing is that when we backed out the change, I mean when the SAN was moved back to the older hardware, things started working fine. Note that the operating system on the older hardware is 5.3. As we made both changes at the same time, moving the hardware and installing the new operating system, I cannot narrow down the error.
I'm not sure why db2 was ok with this log file on the old hardware/OS and found a problem after the upgrade (unless db2 was reading a different set of logs). Crash recovery should not be required if the db is consistent (consistent is YES in db cfg), and it should be consistent if it was properly shut down. So, you should be able to connect without db2 needing to process any log files.
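For anyone who wants to script that consistency check before a move, here is a minimal Python sketch. It assumes the output of `db2 get db cfg` has been captured as text and that the relevant line is labelled "Database is consistent"; the function name and the sample text are made up for illustration, so compare the label against your own output first.

```python
def is_database_consistent(db_cfg_text):
    """Return True/False from the 'Database is consistent' line, or None if absent.

    Assumes lines of the form 'label = value', as printed by 'db2 get db cfg'.
    """
    for line in db_cfg_text.splitlines():
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip().lower() == "database is consistent":
            return value.strip().upper() == "YES"
    return None  # label not found; do not guess


# Illustrative sample only, not captured from a real system.
sample_cfg = """
 Backup pending                                          = NO
 Database is consistent                                  = NO
 Rollforward pending                                     = NO
"""

if __name__ == "__main__":
    print(is_database_consistent(sample_cfg))
```

If this returns False after a clean shutdown, something is off before the move even starts.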
Finally figured out the issue. Here is how our environment is set up. We have two VGs on the server: rootvg and QAvg. The DB2 binaries are installed in rootvg, which has filesystems like /opt/IBM, /usr, /var, /home, /opt etc. All the containers and db2 database files are under QAvg. When moving the database to new hardware, our AIX team takes a mksysb copy of rootvg on the old hardware and pushes it to the new hardware a couple of days prior to the change. As the contents of rootvg are mostly binaries, they remain unchanged over a short period. During the change, they unmount all the filesystems and LVs under QAvg from the older host and move them to the new host.
In this case, however, one of the filesystems holding the db2 database directory, I mean the place where sqldbdir and the log control files are present, was under rootvg and not under QAvg. Someone probably designed it wrongly when the server was first built. So on the new box these log control files were three days old, because the mksysb copy was three days old. When everything was moved to the new box and we tried to activate the database, the database manager looked for a log file based on the LSN in the log control file. That LSN was three days stale, so the database failed to start and log file errors were written to db2diag.log.
After realizing that this directory is present under rootvg, we copied over all the files of that filesystem from the old server to the new server and everything started working fine.
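To catch this kind of layout problem before a migration, a check along the following lines could help. This is only a sketch: the mount-point-to-VG mapping would have to be collected by hand or from something like `lsvg -l <vg>`, and every path, mount point and function name below is a hypothetical example, not our real layout.

```python
import posixpath


def owning_mount(path, mount_points):
    """Return the longest mount point that contains path, or None."""
    path = posixpath.normpath(path)
    best_key, best_len = None, -1
    for mount in mount_points:
        m = posixpath.normpath(mount)
        inside = (m == "/") or (path == m) or path.startswith(m + "/")
        if inside and len(m) > best_len:
            best_key, best_len = mount, len(m)
    return best_key


def paths_on_volume_group(db_paths, mount_to_vg, vg="rootvg"):
    """List (path, mount point) pairs for database paths that live on vg."""
    flagged = []
    for p in db_paths:
        mount = owning_mount(p, mount_to_vg)
        if mount is not None and mount_to_vg[mount] == vg:
            flagged.append((p, mount))
    return flagged


# Hypothetical layout for illustration only.
mount_to_vg = {
    "/": "rootvg",
    "/home": "rootvg",
    "/db2/data": "QAvg",
    "/db2/logs": "QAvg",
}
db_paths = [
    "/home/db2inst1/db2inst1/NODE0000/SQL00001",  # database directory
    "/db2/data/cont01",                           # tablespace container
    "/db2/logs",                                  # active log path
]

if __name__ == "__main__":
    for path, mount in paths_on_volume_group(db_paths, mount_to_vg):
        print(f"{path} is on {mount} (rootvg), so a mksysb copy of it will be stale")
```

Anything this flags either needs to be relocated to the data VG or copied over fresh at cutover time, as we ended up doing.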