Results 1 to 10 of 10
  1. #1
    Join Date
    Dec 2005
    Posts
    67

    Unanswered: Deduplication in Access

    Hi

    I need to get rid of duplicates in an Access table. The 'find duplicates' query wizard is not doing what I want as the whole table is duplicated. Does anyone have any suggestions/solutions/ideas?

    Thanks for your help.

  2. #2
    Join Date
    Aug 2004
    Location
    Cary, NC
    Posts
    264
    If the find duplicates query is finding the duplicate records and you really don't care which one of them you delete you can do this in VBA.
    (it is very rough but it worked in my situation)
    Code:
        Dim deleteThese As String
        Dim pinCurr As Long
        Dim pinLast As Long
        Dim pinDelete As Long
                    
        Set db = CurrentDb
        
        qSql = "{The SQL from the find Duplicates Query} "   ' <---  ADD ORDER BY on duplicated fields to the SQL
        Set rs = db.OpenRecordset(qSql)
        
        'Loop through all the records in the find duplicates query
        Do While Not (rs.EOF)
            
            'rs![theIndex] is the unique field in the table with duplicates
            pinCurr = rs![theIndex]
            
            If (pinLast = pinCurr) Then
                'This will be the second duplicate, add it to the delete list
                pinDelete = rs![theIndex]
                deleteThese = deleteThese & pinDelete & ","
            Else
                'First time this index is found
                pinLast = pinCurr
            End If
            
            rs.MoveNext
        Loop
            If (deleteThese = "") Then GoTo notinDb
            
            'Remove the last comma from the string
            deleteThese = Left(deleteThese, Len(deleteThese) - 1)
            
            dSql = "DELETE from {tableName} WHERE theIndex in (" & deleteThese & ")"
            db.Execute (dSql)
        
    notinDb:
    You would have to adapt it a bit but it might help with the logic.

    Steve

  3. #3
    Join Date
    Dec 2005
    Posts
    67

    Deduplication

    Thanks - will try and let you know what happened.

  4. #4
    Join Date
    Jul 2003
    Location
    Michigan
    Posts
    1,941
    Another way would be to add an autonumber field to the table. Once it is saved, you will have a unique field for each record. After that, you can create a "delete duplicates" query.
    Inspiration Through Fermentation

  5. #5
    Join Date
    Dec 2004
    Location
    Madison, WI
    Posts
    3,926
    If I don't care about the autonumber/primary key and relationships, what I do is make 2 Find Duplicates query, one of them I change to a make-table query and make it unique (distinct) so only unique records show (ie. changing Unique values to yes for the query properties and unchecking the show button in the query for the autonumber and any other fields which cause duplication). I make the new unique table. I then run the other Find Duplicates query and delete all the records returned in that query. Then I simply append the unique table back to the main data table. Thus, no code writing and it takes me about 2 minutes to do it.

    If I have relationships to the autonumber/primary key field, then it becomes more complex.
    Last edited by pkstormy; 07-30-07 at 15:43.
    Expert Database Programming
    MSAccess since 1.0, SQL Server since 6.5, Visual Basic (5.0, 6.0)

  6. #6
    Join Date
    Aug 2004
    Location
    Cary, NC
    Posts
    264
    RedNeckGeek and pkstormy definitely have the easier approach. Now that they reminded me, I recall using a similar method in the past (perhaps on a day I was thinking more clearly).

    Make a structure only copy of the table with duplicates, and create a unique index on the duplicated field(s). Make the find duplicates an append query into this tempTable. The unique index (which does not exist in the orignal table) will cause append errors for all the duplicated records. This table will now show only one of the duplicated records and it will be the record that will be deleted. (also a nice backup if you desire)

    Then create a delete query between the newly created table and the table with the duplicates linking the tables with the index field.

    You will end up with a table without duplicates and a second table that contains the deleted duplicates.

    Thanks for reminding me of an easier way

    Steve

  7. #7
    Join Date
    Dec 2004
    Location
    Madison, WI
    Posts
    3,926
    Steve,

    You don't need to really worry about creating an index or unique key. Just make the unique records table from one of the duplicates query by changing it's properties from a select to a make-table and setting the Unique Values to Yes (don't remove the criteria line in this query for retrieving duplicate values). You're sort of fooling the query where it says "return all duplicated records but return them uniquely and make a table with them." Make your table of unique records. Delete the records in the 2nd duplicates query (which shows all the duplicated records), and append the records from the unique table to your main table.

    Whether you have a unique key or not, it doesn't really matter doing it like this.
    Last edited by pkstormy; 07-30-07 at 16:32.
    Expert Database Programming
    MSAccess since 1.0, SQL Server since 6.5, Visual Basic (5.0, 6.0)

  8. #8
    Join Date
    Dec 2005
    Posts
    67

    Deduplication

    thanks every one. Really appreciate your help. Used sps's vba method, but I have stored the other answers for future reference. thank you much. I'm sure I'll be back with more queries.

    Just an aside, is just anyone allowed to answer queries or does it have to be a moderator? for instance, if I see a query I'm able to help with, can I do that or is that disallowed?

  9. #9
    Join Date
    Dec 2004
    Location
    Madison, WI
    Posts
    3,926
    tasha123,

    Answer as many questions in the posts as you like. There's no rule on who can and cannot answer questions. It's always welcome to get different points of view from other users and feel free to use the experience you have to answer any questions. The only advice I can give you is try not to be critical of someone else's answers if you have a better way to do it.
    Last edited by pkstormy; 08-04-07 at 03:56.
    Expert Database Programming
    MSAccess since 1.0, SQL Server since 6.5, Visual Basic (5.0, 6.0)

  10. #10
    Join Date
    Dec 2005
    Posts
    67
    Ok - thanks.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •