I'm fairly new to MySQL and am trying to do some data modeling for an application I'm building. I'm caught in what I'm guessing is a common problem: reconciling data that users entered by hand in the past with the record they would have selected from the database had it existed at the time. For example, I want to give users the option to enter a record (e.g., common profile info like the name and address of a place), but those same places will also be entering their own information at a later time. I want to be able to consolidate the data at some point in the future without breaking the user's records that depend on the original entry, and so that the place gains the ability to see the user's data.
Let me explain a little more clearly:
User enters this profile info to keep track of transactions that they have with the deli:
Central Park Deli
324 Central Park West
New York, NY 10001
The user then uses that data to record transactions made with Central Park Deli over the course of many months. Months later, though, Central Park Deli decides to start using the application and enters the same (or very similar) information. The Deli enters:
Central Park Deli, Inc.
New York, NY 10001
As administrator of the application, I ultimately want to be able to associate the user's transaction records with the record that Central Park Deli, Inc. created after those transactions were completed. While I would love this to be an automated process, I'm perfectly happy doing it manually through a consolidation interface that I'm hoping to build in PHP.
So my question is: is this possible? What is the best way to go about it? I'm guessing this risks some data consistency problems; how serious are they? Finally, what is the best alternative?
What you have is an Extract-Transform-Load (ETL) problem. It also goes by the moniker of database scrubbing. You are trying to transform data to make it consistent. You will have to create a transform database that, for each column, lists all the distinct values the column contains alongside a column of the accepted values they map to.
For example, a transform table may contain the following:
You will have to write SQL scripts to take data from the existing source database, perform the transforms, and insert the transformed values into the target database. This will ensure your new database's consistency.
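To make the source/transform/target flow a bit more concrete, here is a minimal sketch using the deli example from the question. Everything in it is an assumption for illustration: the table names (`place`, `source_txn`, `place_transform`, `target_txn`), the columns, the extra "Corner Bakery" row, and the amounts are all made up, and it uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for MySQL so it can be run as-is. The source and target "databases" are just differently named tables here. The `INSERT ... SELECT` with a `LEFT JOIN` is standard SQL and should carry over to MySQL with little or no change, but treat it as a starting point rather than a finished script.

```python
import sqlite3


def build_demo() -> sqlite3.Connection:
    """Create hypothetical source, transform, and target tables in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE place (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE source_txn (
            id           INTEGER PRIMARY KEY,
            place_id     INTEGER NOT NULL REFERENCES place(id),
            amount_cents INTEGER NOT NULL
        );
        -- Transform table: one row per duplicate, pointing at the accepted record.
        CREATE TABLE place_transform (
            source_place_id   INTEGER PRIMARY KEY REFERENCES place(id),
            accepted_place_id INTEGER NOT NULL REFERENCES place(id)
        );
        CREATE TABLE target_txn (
            id           INTEGER PRIMARY KEY,
            place_id     INTEGER NOT NULL REFERENCES place(id),
            amount_cents INTEGER NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO place (id, name) VALUES (?, ?)",
        [
            (1, "Central Park Deli"),        # entered by the user
            (2, "Central Park Deli, Inc."),  # entered later by the deli itself
            (3, "Corner Bakery"),            # made-up place with no duplicate
        ],
    )
    conn.executemany(
        "INSERT INTO source_txn (id, place_id, amount_cents) VALUES (?, ?, ?)",
        [(1, 1, 1250), (2, 1, 830), (3, 3, 400)],
    )
    # The administrator decides that place 1 is really place 2.
    conn.execute(
        "INSERT INTO place_transform (source_place_id, accepted_place_id) "
        "VALUES (1, 2)"
    )
    conn.commit()
    return conn


def load_target(conn: sqlite3.Connection) -> int:
    """Copy source rows into the target, swapping in accepted place ids.

    Rows whose place has no entry in place_transform keep their original id.
    Returns the number of rows now in target_txn.
    """
    with conn:  # commits on success, rolls back on error
        conn.execute(
            """
            INSERT INTO target_txn (id, place_id, amount_cents)
            SELECT s.id,
                   COALESCE(t.accepted_place_id, s.place_id),
                   s.amount_cents
            FROM source_txn AS s
            LEFT JOIN place_transform AS t
                   ON t.source_place_id = s.place_id
            """
        )
    return conn.execute("SELECT COUNT(*) FROM target_txn").fetchone()[0]


if __name__ == "__main__":
    demo = build_demo()
    load_target(demo)
    for row in demo.execute(
        "SELECT id, place_id, amount_cents FROM target_txn ORDER BY id"
    ):
        print(row)
```

A manual consolidation interface like the one described in the question would then only need to add rows to the transform table; the source transactions are never modified, so a wrong mapping can be corrected and the load rerun against an empty target.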