  1. #1
    Join Date
    Dec 2008
    Posts
    76

    Unanswered: HADR Setup Issues

    The situation:
    We are running DB2 v9.5 FP 2a on Solaris 10 (Sun T2000). We have created mirror copy installations on two servers following the step-by-step command line setup instructions in the Redbook "High Availability and Disaster Recovery Options for DB2 on Linux, UNIX, and Windows (August, 2008)". All seems to be fine up to and during the startup commands, but the systems never go into peer state.

    COMMANDS ISSUED ON THE STANDBY:
    $ db2 restore db hadrtest from /DB2/data/backup taken at 20081203125214 replace history file
    SQL2539W Warning! Restoring to an existing database that is the same as the
    backup image database. The database files will be deleted.
    Do you want to continue ? (y/n) y
    DB20000I The RESTORE DATABASE command completed successfully.
    $ db2 deactivate database hadrtest
    SQL1496W Deactivate database is successful, but the database was not
    activated.
    $ db2 start hadr on database hadrtest as standby
    DB20000I The START HADR ON DATABASE command completed successfully.

    COMMANDS ISSUED ON THE PRIMARY:
    $ db2 deactivate database hadrtest
    db2 start hadr on database hadrtest as primary
    SQL1496W Deactivate database is successful, but the database was not
    activated.
    $ db2 start hadr on database hadrtest as primary
    DB20000I The START HADR ON DATABASE command completed successfully.
    $ db2 activate db hadrtest
    SQL1490W Activate database is successful, however, the database has already
    been activated on one or more nodes.

    PRIMARY:
    $ db2pd -d hadrtest -hadr

    Database Partition 0 -- Database HADRTEST -- Active -- Up 0 days 16:47:34

    HADR Information:
    Role State SyncMode HeartBeatsMissed LogGapRunAvg (bytes)
    Primary Disconnected Sync 0 92630467

    ConnectStatus ConnectTime Timeout
    Disconnected Thu Dec 4 16:22:59 2008 (1228429379) 10

    LocalHost LocalService
    10.88.0.83 DB2_HADR_1

    RemoteHost RemoteService RemoteInstance
    10.88.0.84 DB2_HADR_2 db2mdes1

    PrimaryFile PrimaryPg PrimaryLSN
    S0000004.LOG 2379 0x000000729901B4DB

    StandByFile StandByPg StandByLSN
    S0000002.LOG 570 0x000000728ECCA708

    STANDBY:
    $ db2pd -d hadrtest -hadr

    Database HADRTEST not activated on database partition 0.

    Option -hadr requires -db <database> or -alldbs option and active database.

    We have no clue as to why the standby is not activated. The HADR section of the standby db config is:
    HADR database role = STANDBY
    HADR local host name (HADR_LOCAL_HOST) = 10.88.0.84
    HADR local service name (HADR_LOCAL_SVC) = DB2_HADR_2
    HADR remote host name (HADR_REMOTE_HOST) = 10.88.0.83
    HADR remote service name (HADR_REMOTE_SVC) = DB2_HADR_1
    HADR instance name of remote server (HADR_REMOTE_INST) = db2mdes1
    HADR timeout value (HADR_TIMEOUT) = 10
    HADR log write synchronization mode (HADR_SYNCMODE) = SYNC
    HADR peer window duration (seconds) (HADR_PEER_WINDOW) = 0

    The servers are on the same network switch and have no trouble communicating. We have followed directions to the letter.

    Anybody got some insights?
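A minimal sketch for checking the HADR role and state from a script, assuming the DB2 9.5 `db2pd -hadr` layout shown in the post above (a header line starting with "Role State", followed by one line of values). The function names `hadr_role_state` and `hadr_is_peer` are hypothetical helpers, not part of DB2.

```shell
#!/bin/sh
# Print "<Role> <State>" from `db2pd -d <db> -hadr` output read on stdin.
# Assumes the layout shown in the post: a "Role State ..." header line
# followed by a single line of values.
hadr_role_state() {
    awk '
        found && NF >= 2 { print $1, $2; exit }        # first value line after the header
        $1 == "Role" && $2 == "State" { found = 1 }    # remember that the header was seen
    '
}

# Exit status 0 only when the reported state is "Peer".
# Prints nothing and fails when the database is not activated,
# because db2pd then emits no "Role State" header at all.
hadr_is_peer() {
    [ "$(hadr_role_state | awk '{ print $2 }')" = "Peer" ]
}

# Hypothetical usage on a live system (requires db2pd on PATH):
#   db2pd -d hadrtest -hadr | hadr_role_state
#   db2pd -d hadrtest -hadr | hadr_is_peer && echo "pair is in peer state"
```

Run against the primary's output above, `hadr_role_state` would report the role as Primary and the state as Disconnected, which is the symptom described in the post.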

  2. #2
    Join Date
    May 2003
    Location
    USA
    Posts
    5,737
    Did you try to activate the standby db?

    Separately, I think you are making a big mistake with SYNC mode. NEARSYNC provides virtually the same level of redundancy (if both servers fail at the same time, then you are out of luck anyway) and performs much better. I also think that HADR_TIMEOUT is a bit too low.
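A sketch of how the two settings mentioned above could be changed. `set_hadr_tuning` is a hypothetical wrapper; the value 120 is only an illustrative larger timeout (it is the DB2 default for HADR_TIMEOUT), since the post does not name a number. The same values are assumed to be needed on both the primary and the standby.

```shell
#!/bin/sh
# Hypothetical helper: set the HADR sync mode and timeout for one database.
#   $1 = database name, $2 = sync mode (e.g. NEARSYNC), $3 = timeout in seconds
set_hadr_tuning() {
    db2 update db cfg for "$1" using HADR_SYNCMODE "$2" HADR_TIMEOUT "$3"
}

# Example, to be run on BOTH servers (values are illustrative assumptions):
#   set_hadr_tuning hadrtest NEARSYNC 120
```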
    M. A. Feldman
    IBM Certified DBA on DB2 for Linux, UNIX, and Windows
    IBM Certified DBA on DB2 for z/OS and OS/390

  3. #3
    Join Date
    Dec 2008
    Posts
    76
    This is the result:
    $ db2 activate db hadrtest
    DB20000I The ACTIVATE DATABASE command completed successfully.
    $ db2pd -d hadrtest -hadr

    Database HADRTEST not activated on database partition 0.

    Option -hadr requires -db <database> or -alldbs option and active database.

    The sync mode is not critical at this point, as this is meant simply as a proof of concept, not a production database, but I thank you for the advice. According to the docs, SYNC can be used without significant degradation if the servers are "in the same rack" (OK, this is IBM), but I'll keep your suggestions in mind when we get to production.

  4. #4
    Join Date
    May 2003
    Location
    USA
    Posts
    5,737
    Go through the process again: stop HADR on both sides, then redo the backup and the restore (take a new backup of the primary; don't just restore the same backup image as before). Before you take the backup, make sure the primary is in archive log mode (db2 update db cfg for hadrtest using LOGARCHMETH1 DISK:/path) and restart the instance to make sure the change is in effect.

    Also check that the log path name (and archive path name) are the same on the primary and standby servers before you start. All the mount points/paths used by DB2 should be the same on both servers before you start.

    Check db2diag.log for any error messages (it is in the <instance-home>/sqllib/db2dump directory).

    If you still have problems, make sure there are no firewall issues with the HADR ports.
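The re-initialization described above could be scripted roughly as follows. This is a sketch under assumptions: the database name and backup directory are taken from the first post, the function names are hypothetical, the backup image is assumed to be copied to the same path on the standby by some other means, and there is no error handling (the db2 command returns a non-zero code for warnings such as SQL1496W, so the steps are deliberately not chained with `&&`).

```shell
#!/bin/sh
DB=hadrtest                   # database name from the original post
BACKUP_DIR=/DB2/data/backup   # backup directory from the original post

# Run on the PRIMARY: stop HADR and take a fresh offline backup.
primary_prepare() {
    db2 deactivate database "$DB"
    db2 stop hadr on database "$DB"
    db2 backup database "$DB" to "$BACKUP_DIR"
}

# Run on the STANDBY after the new image has been copied over.
#   $1 = timestamp of the new backup image (printed by the backup command)
standby_reinit() {
    db2 deactivate database "$DB"
    db2 stop hadr on database "$DB"
    db2 restore database "$DB" from "$BACKUP_DIR" taken at "$1" \
        replace history file without prompting
    db2 start hadr on database "$DB" as standby
}

# Run on the PRIMARY last, once the standby has been started.
primary_start() {
    db2 start hadr on database "$DB" as primary
}

# Hypothetical order of use:
#   primary:  primary_prepare
#   standby:  standby_reinit <timestamp>
#   primary:  primary_start
```

Starting the standby before the primary follows the order used in the first post.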

    Regarding other discussion:

    NEARSYNC can be used with zero degradation, since it usually takes longer to write the log to disk after a commit on the primary than it takes for the primary to send the logs to the HADR buffer on the standby and receive acknowledgement from the standby. This assumes you have a decent network connection between the two, which you apparently do.

    You cannot lose any data on the standby with NEARSYNC unless both servers go down at the exact same time. But if that happens, just bring the primary back up again (there is no reason to use the standby if it also crashed); DB2 goes into crash recovery and you have not lost any committed transactions. SYNC mode is only for marketing purposes: it has no practical benefit if you think through the scenarios, and it will affect performance to some degree.
    M. A. Feldman
    IBM Certified DBA on DB2 for Linux, UNIX, and Windows
    IBM Certified DBA on DB2 for z/OS and OS/390

  5. #5
    Join Date
    Dec 2008
    Posts
    76
    Didn't get back into this until this morning. The database is showing up as bad, so we are starting over from scratch. Thank you for your help. Hopefully, it will be as smooth as advertised once we get everything rebuilt. <12/11/2008> It did.
    Last edited by rdutton; 12-11-08 at 12:58.

  6. #6
    Join Date
    Feb 2011
    Location
    Barcelona
    Posts
    1

    What about deleting logs?

    The same thing has happened to me:

    When I started HADR on the standby before the primary, I could see normal output, with a standby log higher than that on the primary. After starting it on the primary, I got the same output for the standby as rdutton, and "Disconnected" on the primary.

    I could fix this by deleting the logs on the standby after the restore operation. I am no DB2 master, so I cannot tell how sacrilegious this is; it just worked for me.

  7. #7
    Join Date
    Jan 2011
    Posts
    2
    1. Stop HADR on the standby server.
    2. Move all logs to a different directory.
    3. Start HADR on the standby after moving the logs.
    4. Activate the database.
    5. Check the HADR status by issuing this command: db2pd -d <dbname> -hadr

    These steps work for me. Good luck.
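The five steps above might look like this as a script. It is only a sketch of what the posters report worked for them, not a verified or documented recovery procedure. Assumptions: the database name comes from the first post, the log files match `S*.LOG` as in the db2pd output earlier in the thread, the holding directory is supplied by the caller, and all function names are hypothetical.

```shell
#!/bin/sh
DB=hadrtest   # database name from the original post

# Print the log directory reported in the database configuration
# (the "Path to log files" line of `db2 get db cfg`).
standby_log_path() {
    db2 get db cfg for "$DB" | awk -F' = ' '/Path to log files/ { print $2 }'
}

# Move S*.LOG files out of the log directory instead of deleting them.
#   $1 = log directory, $2 = holding directory
move_standby_logs() {
    [ -n "$1" ] && [ -d "$1" ] || return 1   # refuse an empty or missing path
    mkdir -p "$2" || return 1
    for f in "$1"/S*.LOG; do
        if [ -e "$f" ]; then
            mv "$f" "$2"/ || return 1
        fi
    done
}

# Steps 1 to 5 from the post, run on the STANDBY.
#   $1 = holding directory for the moved log files
restart_standby_without_logs() {
    db2 deactivate database "$DB"
    db2 stop hadr on database "$DB"                           # step 1
    move_standby_logs "$(standby_log_path)" "$1" || return 1  # step 2
    db2 start hadr on database "$DB" as standby               # step 3
    db2 activate database "$DB"                               # step 4
    db2pd -d "$DB" -hadr                                      # step 5
}

# Hypothetical usage:
#   restart_standby_without_logs /tmp/hadrtest_old_logs
```

Moving the files rather than deleting them keeps the option of putting them back if the standby still fails to connect.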
