Results 1 to 3 of 3
  1. #1
    Join Date
    May 2009
    Location
    Crete, Greece
    Posts
    90

    Unanswered: Delete duplications in relational db

    I have appended a load of data from an excel source into my relational db, and the relationships are working nicely. The main table is Customers, with Payments and Bookings tables linked to it with one-to-many relationships, with the one side being Customers. My problem now is that many Customers records are duplicated, so I need to delete the duplications without losing the one-to-many relationship from Payments and Bookings to each customer.

    Any guidance would be appreciated.

  2. #2
    Join Date
    Mar 2009
    Posts
    5,442
    Provided Answers: 14
    Here is a solution:

    In this example I have two tables: Tbl_Customers_Groups and Tbl_Customers_Plants.

    - Tbl_Customers_Groups has a primary key column: SysCounter and a column Customer_Code (among other columns).

    - Tbl_Customers_Groups has several rows with the same Customer_Code value and I want to keep only one row for each
    Customer_Code value (the one with the lowest SysCounter value).

    - Tbl_Customers_Plants has a foreign key column FK_Customers_Groups that creates a One (Tbl_Customers_Groups.SysCounter)
    to Many (Tbl_Customers_Plants.FK_Customers_Groups) relationship.

    - When eliminating the duplicated rows in Tbl_Customers_Groups, each row in the FK_Customers_Groups column of the
    Tbl_Customers_Plants table will be updated to contain the matching SysCounter value of the remaining row in the
    Tbl_Customers_Groups table, so that the relationship between both tables will be kept.

    Code:
    Function EliminateDuplicateFK()
    
        Dim rstDistinct As DAO.Recordset
        Dim rstPK As DAO.Recordset
        Dim strSQL As String
        Dim strIn As String
        Dim lngFK As Long
    '
    ' Retrieve the distinct Customer_Code values.
    '
        strSQL = "SELECT DISTINCT Customer_Code FROM Tbl_Customers_Groups"
        Set rstDistinct = CurrentDb.OpenRecordset(strSQL, dbOpenSnapshot, dbSeeChanges)
    '
    ' With each distinct Customer_Code.
    '
        With rstDistinct
            Do Until .EOF
    '
    ' Retrieve all SysCounter values having the current Customer_Code.
    '
                strSQL = "select SysCounter FROM Tbl_Customers_Groups " & _
                    "WHERE Customer_Code = '" & !Customer_Code & "' " & _
                    "ORDER BY SysCounter"
                Set rstPK = CurrentDb.OpenRecordset(strSQL, dbOpenSnapshot, dbSeeChanges)
    '
    ' Create the IN string (a list of SysCounter values: Val1, Val2, ... ,ValN) for the current set.
    '
                With rstPK
                    lngFK = !SysCounter ' Save the lowest SysCounter value of the current set.
                    strIn = ""
                    Do Until .EOF
                        If Len(strIn) Then strIn = strIn & ", "
                        strIn = strIn & CStr(!SysCounter)
    '
    ' Next SysCounter value
    '
                        .MoveNext
                    Loop
                    Close
                End With
    '
    ' Build the SQL UPDATE and execute it.
    '
                strSQL = "UPDATE Tbl_Customers_Plants " & _
                    "SET FK_Customers_Groups = " & lngFK & " " & _
                    "WHERE FK_Customers_Groups IN (" & strIn & ")"
                CurrentDb.Execute strSQL, dbSeeChanges
    '
    ' Next Customer_Code value.
    '
                .MoveNext
            Loop
            .Close
        End With
    '
    ' Clean up.
    '
        Set rstPK = Nothing
        Set rstDistinct = Nothing
    '
    ' Build the sub-query string
    ' (find rows in Tbl_Customers_Groups with SysCounter having no matching FK_Customers_Groups in Tbl_Customers_Plants).
    '
        strSQL = "SELECT Tbl_Customers_Groups.SysCounter " & _
            "FROM Tbl_Customers_Groups " & _
            "LEFT JOIN Tbl_Customers_Plants ON Tbl_Customers_Groups.SysCounter = Tbl_Customers_Plants.FK_Customers_Groups " & _
            "WHERE (((Tbl_Customers_Plants.FK_Customers_Groups) Is Null))"
    '
    ' Build the DELETE query and execute it.
    ' (delete rows in Tbl_Customers_Groups when SysCounter has no matching FK_Customers_Groups in Tbl_Customers_Plants).
    '
        strSQL = "DELETE * FROM Tbl_Customers_Groups " & _
            "WHERE Tbl_Customers_Groups.SysCounter IN (" & strSQL & ")"
        CurrentDb.Execute strSQL, dbSeeChanges
        
    End Function
    Have a nice day!

  3. #3
    Join Date
    May 2009
    Location
    Crete, Greece
    Posts
    90
    Thank you.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •