  1. #1
    Join Date
    Aug 2007
    Posts
    1

    Unanswered: Re: Updating a database based on another database

    Hi all,

    The following is my dilemma:

    I have two databases: a DB2 database named ServerA and a MySQL database named ServerB. I would like to take a certain number of tables from ServerA and check whether any fields differ from the same tables in ServerB. If there are differences, then update ServerB with the changes. Can anyone tell me how to approach this other than fetching each row and checking whether the two copies are identical? Please note the tables are ENORMOUS in size. Also, just to clarify: no, I cannot access the DB2 server directly from an app, so I have to approach it this way.

    Thank you
    Regards,

    Nathan.

  2. #2
    Join Date
    Mar 2007
    Location
    636f6d7075746572
    Posts
    770
    You're going to need to do this in a program, and most probably row by row, because there is no common interface (that I'm aware of) between DB2 and MySQL. That is, you cannot do SELECT * FROM MySQL.table WHERE NOT EXISTS (SELECT * FROM DB2.table).
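
    Something like this in Python (just a rough sketch using the ibm_db and mysql-connector-python drivers; the connection details, table name and the id/col1/col2 columns are all placeholders, not anything from your actual schema):

    Code:
    # Walk the DB2 table and reconcile each row against the MySQL copy.
    # All connection details and names below are placeholders.
    import ibm_db
    import mysql.connector

    db2 = ibm_db.connect("DATABASE=serverA;HOSTNAME=db2host;PORT=50000;"
                         "UID=user;PWD=secret", "", "")
    my = mysql.connector.connect(host="mysqlhost", user="user",
                                 password="secret", database="serverB")
    cur = my.cursor()

    stmt = ibm_db.exec_immediate(db2, "SELECT id, col1, col2 FROM mytable")
    row = ibm_db.fetch_tuple(stmt)
    while row:
        pk, col1, col2 = row
        cur.execute("SELECT col1, col2 FROM mytable WHERE id = %s", (pk,))
        found = cur.fetchone()
        if found is None:
            # Row missing on the MySQL side: insert it.
            cur.execute("INSERT INTO mytable (id, col1, col2) "
                        "VALUES (%s, %s, %s)", (pk, col1, col2))
        elif found != (col1, col2):
            # Row exists but differs: update it.
            cur.execute("UPDATE mytable SET col1 = %s, col2 = %s "
                        "WHERE id = %s", (col1, col2, pk))
        row = ibm_db.fetch_tuple(stmt)

    my.commit()

    As you can see, with enormous tables this means one round trip per row, which is why the question is worth asking.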

  3. #3
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    You'll need a program but you can avoid looking at every row by adding a timestamp field to both tables. Update the timestamp on every insert and update. Then periodically run the following process to sync the tables.
    • Extract all records where the timestamp is greater than or equal to LAST_DONE.
    • Write these records out to a file.
    • Load this file into a transfer table on the other database.
    • Update or insert these records into the main table on the other database.
    • Set LAST_DONE = max timestamp in your transfer table.
    • Clear the transfer table.


    The above should be pretty quick to run and, if run every 30 minutes, would keep the two tables reasonably closely in sync. Obviously this won't cover deletion of records but, with a little imagination, you could code for that. You might also want to cap the number of records transferred per run (e.g. 10k), which would let you drip-feed the data in initially, though there would be issues if many records share the same timestamp.

    I've used the above process to keep complete databases (of different types) in sync and it works reasonably well. You'll also want to monitor the process to make sure everything is still running.
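
    Here's a minimal sketch of the steps above in Python; the drivers (ibm_db, mysql-connector-python), the "ts" column, the table names and connection details are all placeholders, and the extract-to-file step is folded into an in-memory transfer just to keep it short:

    Code:
    import ibm_db
    import mysql.connector

    db2 = ibm_db.connect("DATABASE=serverA;HOSTNAME=db2host;PORT=50000;"
                         "UID=user;PWD=secret", "", "")
    my = mysql.connector.connect(host="mysqlhost", user="user",
                                 password="secret", database="serverB")
    cur = my.cursor()

    # 1. Extract all records where ts >= LAST_DONE (seed the file with an
    #    old timestamp, e.g. "1970-01-01 00:00:00", before the first run).
    with open("last_done.txt") as f:
        last_done = f.read().strip()
    stmt = ibm_db.prepare(db2, "SELECT id, col1, col2, ts FROM mytable "
                               "WHERE ts >= ?")
    ibm_db.execute(stmt, (last_done,))
    rows = []
    row = ibm_db.fetch_tuple(stmt)
    while row:
        rows.append(row)
        row = ibm_db.fetch_tuple(stmt)

    if rows:
        # 2/3. Load the extracted records into the transfer table
        # (done in memory here rather than via an intermediate file).
        cur.executemany("INSERT INTO transfer_mytable (id, col1, col2, ts) "
                        "VALUES (%s, %s, %s, %s)", rows)

        # 4. Update or insert these records into the main table.
        cur.execute("INSERT INTO mytable (id, col1, col2, ts) "
                    "SELECT id, col1, col2, ts FROM transfer_mytable "
                    "ON DUPLICATE KEY UPDATE col1 = VALUES(col1), "
                    "col2 = VALUES(col2), ts = VALUES(ts)")

        # 5. Set LAST_DONE to the max timestamp transferred; the value came
        #    from DB2, so it round-trips cleanly into the next extract.
        with open("last_done.txt", "w") as f:
            f.write(str(max(r[3] for r in rows)))

        # 6. Clear the transfer table ready for the next run.
        cur.execute("DELETE FROM transfer_mytable")
        my.commit()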

    Mike

  4. #4
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    I don't agree that you'll need a program, but either way there is no perfect solution to this that I can think of.

    My first thought: copy TableA from ServerA into a temporary table on ServerB. Then the comparisons can be done in plain SQL using the appropriate JOIN.
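
    As a sketch, assuming TableA has already been copied into a staging table (all names here are placeholders, and I'm keeping to Python for consistency with the snippets above), the whole comparison and update collapses into one statement. <=> is MySQL's NULL-safe equality, so NULL columns compare sanely:

    Code:
    # staging_a holds the copy of TableA from ServerA; names are placeholders.
    import mysql.connector

    my = mysql.connector.connect(host="mysqlhost", user="user",
                                 password="secret", database="serverB")
    cur = my.cursor()

    # Insert rows missing from the live table; rows that exist but differ
    # hit the duplicate key and get updated instead.
    cur.execute("INSERT INTO mytable (id, col1, col2) "
                "SELECT s.id, s.col1, s.col2 "
                "FROM staging_a s LEFT JOIN mytable t ON t.id = s.id "
                "WHERE t.id IS NULL "
                "   OR NOT (t.col1 <=> s.col1) "
                "   OR NOT (t.col2 <=> s.col2) "
                "ON DUPLICATE KEY UPDATE col1 = VALUES(col1), "
                "col2 = VALUES(col2)")
    my.commit()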

    Obviously, the flaw in this method is
    Quote Originally Posted by thusi02
    the tables are ENORMOUS
    After reading Mike's post, however, I must raise one question: does this process need to run in real time, or would it be acceptable to have the data synced every hour/day/week*

    *delete as appropriate
    George
    Home | Blog

  5. #5
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    My first thought: copy TableA from ServerA into a temporary table on ServerB. Then the comparisons can be done in plain SQL using the appropriate JOIN ... Obviously, the flaw in this method is ... the tables are ENORMOUS
    Which is why this method is best ignored.

    I don't agree that you'll need a program
    So, apart from being extremely slow and inefficient, how would this method run without a controlling program?

    After reading Mike's post, however, I must raise one question: does this process need to run in real time, or would it be acceptable to have the data synced every hour/day/week*
    The method I proposed is run periodically, which means it can be run at whatever frequency the user wants (5 min, 30 min, an hour, a day ...). Obviously you'll want an index on the timestamp field to ensure good performance. If you do need the data in sync closer than a minute, then we might need to rethink.
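
    For example (a one-off piece of DDL; "mytable" and "ts" are the same placeholder names as in my earlier sketch, and you'd run the equivalent on the DB2 side too):

    Code:
    # Index the timestamp column so the "ts >= LAST_DONE" extract stays
    # fast as the table grows.
    import mysql.connector

    my = mysql.connector.connect(host="mysqlhost", user="user",
                                 password="secret", database="serverB")
    cur = my.cursor()
    cur.execute("CREATE INDEX idx_mytable_ts ON mytable (ts)")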

    Mike

  6. #6
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    I think the OP may have to re-define what he means by enormous. Enormous to some is 10K records, in which case I would still put my idea forward.

    Information about how often this sync needs performing would also be useful.

    The only qualm I have with your suggestion, Mike, is this part:
    Quote Originally Posted by mike_bike_kite
    I've used the above process to keep complete databases (of different types) in sync and it works reasonably well.
    George
    Home | Blog

  7. #7
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    I agree: we do need some clarification on how closely the tables need to be synchronized (i.e. within the minute, hour or day), and it would be good to have a real value for enormous. The standard definitions for enormous are:
    • enormous < 10K rows
    • Enormous < 100K rows
    • ENORMOUS > 5M rows
    "it works reasonably well" just meant it did the job as required and didn't ever fall over.

    If you need to sync more than one table, then store the LAST_DONE timestamps in a table along with the names of the tables being transferred. The chief advantage of this approach is that it doesn't matter at all if the program or either server dies, as nothing will ever be lost. Also, please note I should have put the "clear the transfer table" step at the start of the process rather than at the end.
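
    A sketch of that control table, replacing the flat file in my earlier snippet (the schema and all names here are illustrative, not gospel):

    Code:
    import mysql.connector

    my = mysql.connector.connect(host="mysqlhost", user="user",
                                 password="secret", database="serverB")
    cur = my.cursor()

    # One row per table being synced; last_done is that table's high-water
    # mark. DATETIME avoids MySQL's auto-updating TIMESTAMP behaviour.
    cur.execute("CREATE TABLE IF NOT EXISTS sync_control ("
                "  tbl       VARCHAR(64) PRIMARY KEY,"
                "  last_done DATETIME    NOT NULL)")

    # Each run reads its own mark first ...
    cur.execute("SELECT last_done FROM sync_control WHERE tbl = %s",
                ("mytable",))
    last_done = cur.fetchone()[0]

    # ... and writes the new mark back only after the upsert step (not
    # shown) has finished, so a crash at any point just means the same
    # rows get re-transferred on the next run; nothing is ever lost.
    cur.execute("UPDATE sync_control "
                "SET last_done = (SELECT MAX(ts) FROM transfer_mytable) "
                "WHERE tbl = %s", ("mytable",))
    my.commit()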

    Mike
