05-21-10, 13:59 #1Registered User
- Join Date
- Mar 2007
Unanswered: help with migrating data from one database to another
I have MySQL database which I have redesigned from scratch. The tables in the new database has some mandatory fields which the old database did not have
I need to migrate the data from the old database into the new one. The old database has several thousand records stored in different tables.
What is the best or most practical way to migrate this data so that the mandatory fields in the new database wont be a hinderance?
One of my problems is that in the old database user passwords were stored in an unencrypted format. The new database stores them using phps md5 hash in an encrypted format - that is whne a new user signs up the php script will encrypt the password and then store it in the new database.
If i was to migrate this information from the old database to the new they will end up being stored again in an unencrypted format.
Any help and guidance would be appreciated.
05-21-10, 15:38 #2Resident Curmudgeon
Provided Answers: 54
- Join Date
- Feb 2004
- In front of the computer
You need some kind of ETL (Extract-Transform-Load) process. In this case an ETL process can be as simple as a MySQL script. In other cases, it can be an application/scripts/etc and there are many vendors and open source systems that facilitate ETL projects.
There are many approaches, and only you can decide which is best for you. My suggestion would be to create a "staging table" with the same structure (columns and data types) as your destination table, but without the constraints. Transfer the data to the staging table, massage it as needed, make a backup of anything currently in your destination table, then repeat the try/fix/try again cycle as often as needed to get the data clean and into your destination.
-PatPIn theory, theory and practice are identical. In practice, theory and practice are unrelated.
05-27-10, 13:42 #3Registered User
- Join Date
- May 2010
I think Pat explained the approach nicely, so just a few recommendations from my side.
Before doing the try/fix/try cycle, do a little analysis in the source data for other data-related problems to save a few iterations . Look for duplicates, weird default values or broken data integrity. During systems lifecycle admins tend to drop constraints to facilitate data that were not counted on in the original model.
During try/fix cycle, enable constraints in target one by one, then re-load data to incrementally identify problem groups (they will be rejected by database).
Finally, I suggest doing audit counts to discover any rejected data. Also is possible check the app sitting on top of your database to see if it is affected.
In terms of the missing data, you have several options:
1. You can try to populate the missing data from other sources (e.g other system, other tables, have users type it in). Correct the data in stage, then load to target.
2. You are not able to correct the data. Then you have to fall back to default values (-1, 01/01/1970, etc.). If you use this approach DOCUMENT it, because the application/users must be ready for that!
Another possibility is not to migrate the records with missing data and place them to a reject file/table. Once again, backup and document that, as someone might come looking for the missing data.
I very much support Pat's suggestion to use ETL, but with my background I am a bit biased . While SQL scripting will definitely work if you just migrate within one database, consider using ETL tools if there is some complex data manipulation expected (e.g. one table is mapped to many, data must be validated/recalculated). Today's tools lets you visualize that process, so it is much easier to inspect, debug and understand the migration rules.
Regarding the MD5, either use the built-in md5() function in MySQL, or calculate it on the fly in your ETL job.