07-12-10, 11:53 #1Registered User
- Join Date
- Jul 2010
Unanswered: Need help merging data, access is for me?
I'm fairly new to databases, and was hired by a small company to get them set up with a CRM.
Long story short, they have multiple sets of data that all need to be consolidated into one giant, accuruate, and clean set of records.
The trouble is, there is a lot of overlap in the data, and certain fields have only been kept up-to-date in certain databases. For example, if there was a given record that was stored in several of the databases, and it needed to be updated with a new address, it would only be updated in one of the databases.
And I use the term 'databases' here very loosely, currently we have 3 or 4 very large excel files (2,000+ records), and a Microsoft Outlook data file.
So, my question is: What is the best way to consolidate these various files? I have to be able to give one database the priority in a given field, but a different database priority in something else. (for example, if Excel file A, and excel file B, both have information on the same customer, but the address and phone number don't match, use the Address from Excel file A, but use the phone number from Excel File B.)
And also, what program would you recommend for this task? My employer is willing to purchase Access, but wants to make sure its going to be able to do it first.
Thanks in Advance,
07-12-10, 12:01 #2King of Understatement
- Join Date
- Feb 2004
- One Flump in One Place
I would use a data cleaning company. They will give you a free check of the data and a report that will tell you what the output will be.
Writing something to do this task properly is not something for a newbie - I know because I helped write a commercial product that does just this. It took many hundreds of developer hours, was written in three products\ languages and is cross referenced to about 15 lists purchased from other organisations.Testimonial:
ur codings are working excelent.
07-19-10, 03:26 #3Registered User
- Join Date
- Oct 2004
- Melbourne, Australia
I have carried out this operation in the past, but it is extremely tricky.
Step 1 is to design your target database. To do this you must look at every field worth saving in the old databases and identify which table the data goes into. What may seem to be a unique item in the old database may in fact need to be put in a subsidiary table to allow for duplications between databases. You must move everything into the new database in the first instance, because you cannot be quite sure how structures and relationships might be affected if you omit any data.
Step 2 is to actually move the data into the new database. This is much more complicated than it sounds, because, you are going to need a lot of hierarchical operations, working your way down the tree of subsidiary tables.
This is usually done in VBA, simply because the selection rules across multiple source databases do not lend themselves to expression in SQL queries.
Step 3 is to remove unwanted or duplicated data.
This approach is feasible if you have the competent in-house staff who are familiar with all the systems, but if you are going to bring somebody in from outside, you may as well go for a specialist company, as recommended by pootle flump.
07-19-10, 05:32 #4Jaded Developer
Provided Answers: 59
- Join Date
- Nov 2004
- out on a limb
it can be done ion Access, or any other language or DB environment
however your real problem I suspect is knowing what is the correct current data, and what isn't . once you've cracked that then you can go ahead.
if you have had multiple people adding data to multiple sources I doubt any automated system would work.
however what you could do is import all the data into Access
decide which is 'the' most accurate and use that as the basis of your CRM. design a form which displays that data, and have two subforms which display the other data sources offering the user the opportunity to see the other possibilities. naturally you are going to have to match on surname or company name, in which case suing a Soundex or double metaphone match may be the best alternative. make the sub forms read only, so people can't change data there.
however I don't think this is for the faint hearted or beginnerI'd rather be riding on the Tiger 800 or the Norton
07-19-10, 12:57 #5Registered User
- Join Date
- May 2010
Since this is a one time task hopefully, I find the best way to clean it up is to use humans.
As previously suggested, I would import the data for each source and show it as "history" data. The users will be required to clean it up. And tt is great training for users.