If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > MySQL > Re: Updating a database based on another database

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 08-02-07, 19:15
thusi02 thusi02 is offline
Registered User
 
Join Date: Aug 2007
Posts: 1
Re: Updating a database based on another database

Hi all,

The following is my dilemma:

I have two databases. One DB2 named ServerA and another Mysql named ServerB. I would like to take a certain number of tables from ServerA and check to see if there are any differences in the fields from the same tables in ServerB. If there are differences than update ServerB with the changes. Can anyone tell me how I can approach this rather than checking each row and checking if they are identical etc. Please note the tables are ENORMOUS in size. Also just to clarify, no I cannot access the DB2 server directly in an app etc so have to approach it this way.

Thank you
Regards,

Nathan.
Reply With Quote
  #2 (permalink)  
Old 08-03-07, 03:55
aschk aschk is offline
Registered User
 
Join Date: Mar 2007
Location: 636f6d7075746572
Posts: 770
You're going to need to do this in a program, and most probably row by row because there is no common interface (that I am aware of) between DB2 and MySQL. i.e. you cannot do SELECT * FROM MySQL.table WHERE NOT EXISTS (SELECT * FROM DB2.table)
Reply With Quote
  #3 (permalink)  
Old 08-04-07, 05:03
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
You'll need a program but you can avoid looking at every row by adding a timestamp field to both tables. Update the timestamp on every insert and update. Then periodically run the following process to sync the tables.
  • Extract all records where timestamp greater or equal to LAST_DONE.
  • Extract these records out into a file.
  • Load this file into a transfer table on the other database.
  • Update or insert these records into the main table on the other database.
  • Set LAST_DONE = max timestamp in your transfer table.
  • Clear the transfer table.

The above should be pretty quick to run and, if run every 30min, would keep the two tables reasonably closely in sync. Obviously this won't cover deletion of records but, with a little imagination, you could code for that. You might want to have a maximum number of records transferred ie 10k - which might allow you to drip feed the data in initially - there would be issues if many records have the same timestamp though.

I've used the above process to keep complete databases (of different types) in sync and it works reasonably well. You'll also want to monitor the process to make sure everything is still running.

Mike
Reply With Quote
  #4 (permalink)  
Old 08-04-07, 06:04
gvee gvee is offline
www.gvee.co.uk
 
Join Date: Jan 2007
Location: UK
Posts: 10,156
I don't agree that you'll need a program, but either way there is not a perfect solution to this question that I can think of.

My first thoughts are - copy TableA from ServerA to a temporary table on ServerB. Then your comparisons can simply be done via SQL with the utilization of the correct JOIN implementation.

Obviously, the flaw in this method is
Quote:
Originally Posted by thusi02
the tables are ENOURMOUS
After reading Mikes post however, I must raise one question. Does this process need to be running real-time or would it be acceptable to have the data checked every hour/day/week*

*delete as appropriate
__________________
George
Twitter | Blog
Reply With Quote
  #5 (permalink)  
Old 08-04-07, 09:22
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
Quote:
My first thoughts are - copy TableA from ServerA to a temporary table on ServerB. Then your comparisons can simply be done via SQL with the utilization of the correct JOIN implementation ... Obviously, the flaw in this method is ... the tables are ENOURMOUS
Which is why this method should best be ignored

Quote:
I don't agree that you'll need a program
So apart from being extremely slow and inefficient - how would this method run without a controlling program?

Quote:
After reading Mikes post however, I must raise one question. Does this process need to be running real-time or would it be acceptable to have the data checked every hour/day/week*
The method I proposed was to be run periodically which means it could be run at any frequency the user wants (5 min, 30 min, an hour, a day ... ). Obviously you'll want an index on the timestamp field to ensure good performance. If you do need the data in sync closer than a minute then we might need to rethink.

Mike
Reply With Quote
  #6 (permalink)  
Old 08-04-07, 14:52
gvee gvee is offline
www.gvee.co.uk
 
Join Date: Jan 2007
Location: UK
Posts: 10,156
I think the OP may have to re-define what he means by enormous. Enormous to some is 10K records, in which case I would still put my idea forward.

Information about how often this sync needs performing would also be useful.

The only qualms I have with your suggestion Mike, is this part
Quote:
Originally Posted by mike_bike_kite
I've used the above process to keep complete databases (of different types) in sync and it works reasonably well.
__________________
George
Twitter | Blog
Reply With Quote
  #7 (permalink)  
Old 08-04-07, 15:29
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
I agree - we do need some clarification on how closely the tables need to be synchronized (ie within the minute, hour, day) and it would be good to have a real value for enormous - standard definitions for enormous are:
  • enormos < 10K rows
  • Enormous < 100K rows
  • ENORMOUS > 5m
"it works reasonably well" just meant it did the job as required and didn't ever fall over.

If you need to do more than one table then store the LAST_DONE timestamps in a table along with the names of the tables being transferred. The chief advantage of this approach is that it doesn't matter at all if the program or either server dies as nothing will ever be lost. Also please note I should of put the clear transfer table at the start of the process rather than at the end.

Mike
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On