Results 1 to 6 of 6
  1. #1
    Join Date
    Mar 2003
    Posts
    97

    Unanswered: Updating FK due to dupe records

    Hello,

    I have a "contacts" table with some dupes that need to be deleted. However, there are references in other tables to these dupes so I need to consolidate them in the child tables first before I permanently remove them from the "contacts" table. Below is some sample data, table structure, and desired outcome:

    CONTACTS TABLE

    contact_id emailaddress
    1 jdoe@acme.com
    5 jdoe@acme.com
    40 jdoe@acme.com


    JOURNAL TABLE

    journal_id contact_id
    3 1
    62 1
    110 1
    124 5
    380 40
    395 5
    409 1

    DESIRED OUTCOME

    journal_id contact_id
    380 40
    409 40


    The closest I've gotten is this:
    UPDATE journal_JG SET
    contact_id = (SELECT MAX(contact_id) FROM contacts_JG GROUP BY emailAddress HAVING(COUNT(emailAddress)>1)
    FROM journal_JG INNER JOIN contacts_JG ON journal_JG.contact_id = contacts_JG.contact_id

    but this is clearly wrong.

    Can someone shed some light?

    TIA

  2. #2
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    so you're saying that journal 3, which right now has jdoe as its contact, will afterwards have no contact at all?

    that sounds wrong to me
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  3. #3
    Join Date
    Mar 2003
    Posts
    97
    Good catch. The updated desired outcome is below:

    DESIRED OUTCOME

    journal_id contact_id
    3 40
    62 40
    110 40
    124 40
    380 40
    395 40
    409 40

  4. #4
    Join Date
    Mar 2003
    Posts
    97
    Here's my most recent UPDATE statement:

    UPDATE journal_JG SET
    contact_id = (SELECT MAX(contacts_JG.contact_id) FROM contacts_JG INNER JOIN journal_JG ON journal_JG.contact_id = contacts_JG.contact_id GROUP BY contacts_JG.emailAddress HAVING(COUNT(contacts_JG.emailAddress)>1) )
    FROM journal_JG INNER JOIN contacts_JG ON journal_JG.contact_id = contacts_JG.contact_id

    I get this error, though:
    Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
    The statement has been terminated.

  5. #5
    Join Date
    Nov 2004
    Posts
    1,427
    Provided Answers: 4
    This is what I could come up with. It should work, but i think it will be slow on large tables.
    Code:
    UPDATE #journal 
    SET contact_id = T.MaxContactId
    FROM #journal 
        INNER JOIN #CONTACTS ON
            #journal.contact_id = #CONTACTS.contact_id
        INNER JOIN (SELECT MAX(contact_id) as MaxContactId, 
                        emailAddress
                    FROM #contacts 
                    GROUP BY emailAddress
                    ) as T ON 
            #CONTACTS.emailAddress = T.emailAddress
    Better create a (temporary) translation table that stores the Replace_contact_id and ReplaceBy_contact_id (it will contain (1, 40), (5, 40), (40, 40), ...). Once you have that filled up as desired, the UPDATE will be easy to write and will execute a lot faster.
    With kind regards . . . . . SQL Server 2000/2005/2012
    Wim

    Grabel's Law: 2 is not equal to 3 -- not even for very large values of 2.
    Pat Phelan's Law: 2 very definitely CAN equal 3 -- in at least two programming languages

  6. #6
    Join Date
    Mar 2003
    Posts
    97
    Wim,

    Thanks for helping out! I had to tweak your SQL just a bit to make it work:

    UPDATE journal_JG
    SET contact_id = T.MaxContactId
    FROM journal_JG
    INNER JOIN contacts_JG ON
    journal_JG.contact_id = contacts_JG.contact_id
    INNER JOIN (SELECT MAX(contact_id) as MaxContactId, emailAddress
    FROM contacts_JG GROUP BY emailAddress HAVING(COUNT(emailAddress)>1)
    ) as T ON
    contacts_JG.emailAddress = T.emailAddress


    Thanks again!!

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •