  1. #1
    Join Date
    Jan 2016
    Posts
    3

    Answered: Need advice for Using archival logs for Disaster recovery testing.

    We have a production system on which the customer takes a snapshot backup of the whole (Linux) system at midnight. We have to do a disaster recovery test on this production system.
    Some background on our system: we have 8 databases with fairly heavy transaction volume and file processing.
    To handle this and save the cost of HADR, we decided to try the following.
    Scenario 1: I have set LOGARCHMETH1 to DISK:/Comit_Do_Not_Delete/DB_BKP/ for all the databases. This saves the archive logs under DB_BKP/DB2inst1/ (the log files of all 8 DBs).
    We have written a cron job that copies /Comit_Do_Not_Delete/DB_BKP/* to a remote server every 5 minutes.
    Now we are planning to:
    1) Crash the server.
    2) Restore the previous night's snapshot. (This should have the data up to midnight.)
    3) Copy /Comit_Do_Not_Delete/DB_BKP/ from the remote server to our DR server (on which we are doing the activity).
    4) Roll forward to end of logs.

    My doubt is: although we will lose at least 5 minutes of data, will the DBs use the archive logs and roll forward?

    Scenario 2:

    Above scenario remains same.
    I will take an online backup of all DBs, including logs, at midnight and copy it to a different server along with the archive logs, then restore the DBs and roll forward.

    My doubt in this scenario is what the steps for this activity would be:
    1) I would be restoring the online backup, but would the restored DB look for the archive logs in the same path as the original (/Comit_Do_Not_Delete/DB_BKP/)?
    2) How would I do the rollforward in this scenario?


    PS: I am sorry if this is a silly question; I have no resource or help within my organization other than Google. Google has left me confused, and due to time constraints I need a solution as soon as possible.

  2. Best Answer
    Posted by db2mor

  3. #2
    Join Date
    Apr 2012
    Posts
    1,006
    Provided Answers: 16
    Define precisely what you mean by snapshot-backup.
    Any *file-system* based backup cannot be used for recovery of a DB2 database if that database was ACTIVATED and busy with inflight transactions at the time of the filesystem-backup.
    If your snapshot-backup is not using DB2-services (e.g. ACS) then your D.R strategy is not valid.
    Consider instead taking an ONLINE db2-backup (including logs) regularly to a file-system that is *also* itself replicated to remote sites, in addition to log shipping as you already described.
    The D.R mechanism would then need to restore from the most recent DB2-backup and then rollforward through the logs.
    You did not say whether your business has specified any Service Level Agreement for Recovery-Point-Objective or Recovery-Time-Objective. This matters because the RTO/RPO actually achievable with your strategy might come at a high cost to the business. The business should decide this, not the technical people.
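
A minimal sketch of the backup, restore and rollforward sequence suggested above, written as a dry run that only prints the DB2 CLP commands. The database name `SAMPLE`, the directories and the backup timestamp are hypothetical placeholders, not values from this thread; check the exact options against the documentation for your DB2 version before running anything.

```shell
#!/bin/sh
# Dry run: prints the DB2 commands instead of executing them.
# SAMPLE, the paths and the timestamp are hypothetical placeholders.
DB=SAMPLE
BACKUP_DIR=/backup/db2        # assumed to be replicated to the D.R site
LOG_DIR=/backup/db2/logs      # where logs stored in the image are extracted

backup_cmd() {
    # Online backup that also stores the logs needed for consistency
    echo "db2 backup database $1 online to $2 include logs"
}

restore_cmd() {
    # $3 is the backup image timestamp (yyyymmddhhmmss)
    echo "db2 restore database $1 from $2 taken at $3 logtarget $4"
}

rollforward_cmd() {
    # Quoted so the shell does not interpret the parentheses
    echo "db2 \"rollforward database $1 to end of logs and stop overflow log path ($2)\""
}

backup_cmd "$DB" "$BACKUP_DIR"
restore_cmd "$DB" "$BACKUP_DIR" 20160101000000 "$LOG_DIR"
rollforward_cmd "$DB" "$LOG_DIR"
```

In this sketch the shipped archive logs would also have to be placed in the overflow directory (or in the configured archive location) so that the rollforward can go beyond the logs contained in the backup image.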

  4. #3
    Join Date
    Jul 2016
    Location
    Moscow
    Posts
    67
    Provided Answers: 7
    S1:
    If you do a file system level backup, you must use the technique (suspended I/O) described here:
    Using a split mirror as a backup image

    You must provide the transaction logs for the rollforward operation in any case. There are a number of paths where DB2 can look for them itself during a rollforward operation. If you have these logs somewhere else, you must specify that path in the OVERFLOW LOG PATH clause of the ROLLFORWARD command.
    Regards,
    Mark.
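
As an illustration of the OVERFLOW LOG PATH clause, here is a dry-run sketch that prints one ROLLFORWARD command per database. The database names `DB1` and `DB2` and the per-database subdirectory layout are assumptions made for the example; the overflow path has to be whichever directory actually contains the copied log files.

```shell
#!/bin/sh
# Dry run: prints one ROLLFORWARD command per database.
# DB1/DB2 and the per-database directory layout are hypothetical.
SHIPPED_ROOT=/Comit_Do_Not_Delete/DB_BKP   # copied back from the remote server

rollforward_with_overflow() {
    db=$1
    logdir=$2   # must be the directory that holds the log files themselves
    echo "db2 \"rollforward database $db to end of logs and stop overflow log path ($logdir)\""
}

for db in DB1 DB2; do
    rollforward_with_overflow "$db" "$SHIPPED_ROOT/$db"
done
```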

  5. #4
    Join Date
    Jan 2016
    Posts
    3
    Sorry db2mor for not being precise, it is a snapshot backup of the entire VM.

    So what you are suggesting is that we move the online backup we take to the secondary server nightly, along with the job that ships the logs from the primary server every 30 minutes. But should the DB be active on the secondary server?
    The customer can tolerate a data loss of 10 minutes.

  6. #5
    Join Date
    Apr 2012
    Posts
    1,006
    Provided Answers: 16
    Your first post in this thread mentioned that your company wishes to save the cost of DB2 HADR. Can you explain more about that? (i.e. what is your edition of DB2, how is it licensed, and how much money do you think is being saved? In other words, is that cost worth 10 minutes of lost data?)

    The VM-snapshot-backup won't be valid for DB2-recovery unless you involve DB2 around the time of the VM-snapshot-backup (e.g. write suspend/write resume as mark.bb mentioned). You still need regular online-DB2 backups.
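
A rough sketch of how write suspend/write resume could bracket the VM snapshot, again as a dry run that only prints commands. The database name `SAMPLE` is hypothetical, and the snapshot step is a placeholder because it depends on the virtualization tooling, which this thread does not name. On the restored copy, `db2inidb ... as mirror` is the documented way to initialize a split-mirror image so that it can be rolled forward.

```shell
#!/bin/sh
# Dry run: prints the commands that would bracket a VM snapshot.
# SAMPLE is a hypothetical database name.

suspend_cmds() {
    echo "db2 connect to $1"
    echo "db2 set write suspend for database"
}

resume_cmds() {
    echo "db2 set write resume for database"
    echo "db2 connect reset"
}

init_mirror_cmd() {
    # Run on the restored copy: puts the database in rollforward pending state
    echo "db2inidb $1 as mirror"
}

suspend_cmds SAMPLE
echo "# take the VM snapshot here (tool-specific placeholder)"
resume_cmds SAMPLE
init_mirror_cmd SAMPLE
```

With 8 databases, every database would need to be suspended before the snapshot and resumed afterwards.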

    If the D.R node is accessible to the active-node via NFS then you can get DB2 to perform the log shipping by configuring LOGARCHMETH2 to reference the NFS-mounted D.R node. That avoids the delays of 30 minutes for log shipping that you mentioned. There is no need for DB2 to be running on the D.R node with that arrangement.
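
For illustration, a dry-run sketch of pointing LOGARCHMETH2 at an NFS mount of the D.R node. The mount point `/mnt/dr_archive/` and the database names are assumptions, not values from this thread.

```shell
#!/bin/sh
# Dry run: prints the configuration commands instead of executing them.
# /mnt/dr_archive/ and the database names are hypothetical.
DR_MOUNT=/mnt/dr_archive/

archive2_cmd() {
    # Second archive target: DB2 itself copies each archived log here
    echo "db2 update db cfg for $1 using LOGARCHMETH2 DISK:$2"
}

show_cfg_cmd() {
    echo "db2 get db cfg for $1 | grep -i LOGARCHMETH"
}

for db in DB1 DB2; do
    archive2_cmd "$db" "$DR_MOUNT"
    show_cfg_cmd "$db"
done
```

Since LOGARCHMETH1 stays as it is, the existing local archive is unaffected; the NFS target simply becomes a second copy written by DB2 rather than by cron.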

    You did not specify the RTO that the business requires (you implied an RPO of 10 minutes). You need to be able to prove that the RTO can be achieved by carefully rehearsing the D.R scenarios. In particular, measure the length of time needed to start up the D.R instance, activate its database(s) and perform the rollforwards, especially if the D.R event occurs at the end of the working day, when potentially many log files have been produced since the previous snapshot-backup.
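
One simple way to collect those timings during a rehearsal is a small wrapper that runs each step and reports how long it took. This is only a sketch: the helper name `timed` is made up for the example, and the DB2 steps are shown as comments because they depend on your environment.

```shell
#!/bin/sh
# Run a command and report its wall-clock duration in whole seconds.
# "timed" is a hypothetical helper name used for this example.
timed() {
    label=$1
    shift
    start=$(date +%s)
    "$@"
    rc=$?
    end=$(date +%s)
    echo "$label: $((end - start))s (exit $rc)"
    return $rc
}

# Harmless demonstration step
timed "demo step" sleep 1

# During a real rehearsal the steps might look like this (not executed here):
# timed "start instance"    db2start
# timed "activate database" db2 activate database SAMPLE
# timed "rollforward"       db2 "rollforward database SAMPLE to end of logs and stop"
```

Adding up the per-step durations gives a figure that can be compared against whatever RTO the business specifies.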

  7. #6
    Join Date
    Jan 2016
    Posts
    3
    Thanks for the suggestions, mark and db2mor. The exercise we were planning turned out to be somewhat fruitful, but we do need to spend money on the new server to which we plan to copy our logs and online backups periodically.
    One issue we faced was that restoring the VM snapshot after the server was crashed resulted in a downtime of approximately 2 hrs 14 mins, which is unacceptable to the customer. So management now gets the point that we might need an HADR solution.
