Unanswered: Importing Text Files Preventing Duplicates
Hi Fellow members,
I need to write VB code to import a .txt file into an Access table. During this process the routine has to guarantee that no duplicate records are imported. Which is the best way to complete this task?
I thought of reading one line at a time and querying the Access table to see if that record already exists, but we are talking about extra large .txt files.
Create the appropriate primary key on the table (this enforces uniqueness) and then, assuming you're processing this file line by line, try to run the insert (don't forget your error handling constructs) and check the failure code; if it's the one for a primary key/duplicate insert violation, then Todd's your uncle.
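A minimal sketch of this insert-and-trap approach. It uses Python with SQLite standing in for the Access table (table name, column names, and the pipe-delimited file format are all hypothetical); in VBA you'd trap the equivalent duplicate-key error after `Database.Execute`:

```python
import sqlite3

# Hypothetical stand-in for the Access destination table:
# the primary key on "id" enforces uniqueness for us.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE destination_table (id INTEGER PRIMARY KEY, payload TEXT)")

# Simulated lines from the .txt file; note the duplicate id 1.
lines = ["1|first", "2|second", "1|duplicate of first"]

inserted = rejected = 0
for line in lines:
    rec_id, payload = line.split("|")
    try:
        # Try the insert; the table itself rejects duplicates.
        conn.execute(
            "INSERT INTO destination_table (id, payload) VALUES (?, ?)",
            (int(rec_id), payload),
        )
        inserted += 1
    except sqlite3.IntegrityError:
        # Duplicate primary key: skip this record and carry on.
        rejected += 1

print(inserted, rejected)  # 2 1
```

The point is that no lookup query runs per line; the engine's own uniqueness check does the work, and your code only handles the rejection.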
What if, in the destination table, duplication is determined by only 3 fields? Only the data in those 3 of the 12 fields cannot be duplicated. In my code, until your suggestion, I read one record at a time and query the table to see if that data already exists in those 3 fields.
On that assumption, can I ask for your help?
1 - Which is the best way to read the file?
2 - Does your previous suggestion work in this case?
I think what George was suggesting is to create an index in the table to which you are importing this data. You can include the three fields that determine duplicates and specify a unique constraint on the index. That way your code doesn't do the duplicate checking, the table does. Then all your code has to do is trap the duplicate error, meaning you have to prepare your code to take over in the event a record is rejected by the table as a duplicate. Best of luck.
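A sketch of that multi-column unique index, again with hypothetical field names and SQLite standing in for Access (the 12-field table is cut down to 4 for brevity):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dest (f1 TEXT, f2 TEXT, f3 TEXT, f4 TEXT)")
# Only f1 + f2 + f3 determine duplicates; the unique index
# makes the table do the duplicate checking, not our code.
conn.execute("CREATE UNIQUE INDEX ux_dest ON dest (f1, f2, f3)")

records = [
    ("a", "b", "c", "x"),
    ("a", "b", "d", "y"),
    ("a", "b", "c", "z"),  # duplicate on (f1, f2, f3) even though f4 differs
]

kept = 0
for rec in records:
    try:
        conn.execute("INSERT INTO dest VALUES (?, ?, ?, ?)", rec)
        kept += 1
    except sqlite3.IntegrityError:
        pass  # table rejected it as a duplicate; nothing for us to check
```

In Access you'd build the same thing in the table designer (or with CREATE INDEX ... WITH DISALLOW NULL) across the three fields, then trap the duplicate error in your VBA loop exactly as before.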
You got it JManSF!
Other options include putting the data from the text file in a staging table and mimicking a merge / upsert query.
--Inserts (in this example col1 is the primary key)
INSERT INTO destination_table (col1, col2, col3)
SELECT col1, col2, col3
FROM source_table
WHERE col1 NOT IN (SELECT col1 FROM destination_table)

--Updates (rows whose key already exists)
UPDATE d
SET d.col2 = s.col2
, d.col3 = s.col3
FROM source_table s
JOIN destination_table d
ON s.col1 = d.col1
This method would be far more efficient.
Note: I wrote this SQL from my brainbox without testing, so there are no guarantees (at all!) that this is perfect.
Final note: if you have duplicates in your source table, you won't necessarily get the results you want.
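To make the staging-table idea concrete, here is a runnable sketch in Python with SQLite (table and column names are the hypothetical ones from the SQL above; note Access/Jet and SQL Server syntax would differ, e.g. SQLite needs correlated subqueries instead of UPDATE ... FROM):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE destination_table (col1 INTEGER PRIMARY KEY, col2 TEXT, col3 TEXT);
CREATE TABLE source_table      (col1 INTEGER, col2 TEXT, col3 TEXT);
INSERT INTO destination_table VALUES (1, 'old', 'old');
-- source_table is the staging table loaded from the .txt file
INSERT INTO source_table VALUES (1, 'new', 'new'), (2, 'b', 'c');
""")

# Step 1: insert only the rows whose key is not already in the destination.
conn.execute("""
INSERT INTO destination_table (col1, col2, col3)
SELECT col1, col2, col3
FROM source_table
WHERE col1 NOT IN (SELECT col1 FROM destination_table)
""")

# Step 2: update the rows that already existed (correlated subqueries,
# since plain SQLite has no UPDATE ... FROM join syntax here).
conn.execute("""
UPDATE destination_table
SET col2 = (SELECT s.col2 FROM source_table s WHERE s.col1 = destination_table.col1),
    col3 = (SELECT s.col3 FROM source_table s WHERE s.col1 = destination_table.col1)
WHERE col1 IN (SELECT col1 FROM source_table)
""")
```

Two set-based statements replace the per-line lookup entirely, which is why this scales better on extra large files; the trade-off is the extra load step into the staging table.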