Results 1 to 13 of 13
  1. #1
    Join Date
    Jul 2010
    Posts
    7

    Unanswered: Join a table to itself with no specfic ID

    Join a table to itself with no specfic ID

    I have a table of customer records which has no ID column, a customer may appear in the table more than once with various of the columns populated. I have managed to build a column INIT_SURNAME which is fully poplated but not unique to a customer. How can I merge the rows for matching customers where they have the same INIT_SURNAME and share one other common data element. For example I want to return 3 cutomers from the following table, the 1st 3 rows are the same person.

    e.g

    INPUT:
    INIT_SURNAME|EMAIL|PHONE|OTHERID
    J*SOAP|js@x.com|4321|
    J*SOAP|js@x.com||
    J*SOAP||4321|4
    J*SOAP|otherjoe@y.com||2
    J*DOE||12345|3

    Output:
    INIT_SURNAME|EMAIL|PHONE|OTHERID
    J*SOAP|js@x.com|4321|4
    J*SOAP|otherjoe@y.com||2
    J*DOE||12345|3

    Can this be done?

  2. #2
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Code:
    ....
            customer AS c1
    INNER JOIN 
            customer AS c2
    ON  c2.INIT_SURNAME     = c1.INIT_SURNAME
    AND
        (
            c2.EMAIL        = c1.EMAIL
        OR
            c2.PHONE        = c1.PHONE
    .....
        )

  3. #3
    Join Date
    Jul 2010
    Posts
    7
    Not sure I understand, that will not return the resultset I am looking for.

  4. #4
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Sorry - you are right - I have reread the question more closely.

    I think I know a solution but would need to test, not write air code.

    Please can you change you sample data to include DDL for the table and DML to insert the data.

  5. #5
    Join Date
    Jul 2010
    Posts
    7
    air code lol...

    apologies I should have included that in the beginning.

    create table
    customer
    (
    INIT_SURNAME varchar(20),
    EMAIL varchar(20),
    PHONE varchar(20),
    OTHERID varchar(20),
    )

    insert into customer
    values('J*SOAP','js@x.com','4321','');
    insert into customer
    values('J*SOAP','js@x.com','','');
    insert into customer
    values('J*SOAP','','4321','4');
    insert into customer
    values('J*SOAP','otherjoe@y.com','','2');
    insert into customer
    values('J*DOE','','12345','3');

  6. #6
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Ok - I have to give up. I've spent some time writing a dog's dinner that doesn't quite get the right answer. I think to do so I would need to be able to include expressions in my ROW_NUMBER() call.

    Thoughts - I am struggling to do this without something procedural or using some temp tables really. The stuff we have here for this sort of thing is pseudo procedural (uses lots of CLR) and does a lot of fuzzy matching.

    Hmmm - I've just had another idea for a solution as I type this but no time to test

  7. #7
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    This is a common problem when dealing with non-normalized tables. The problem is deciding just what to do with conflicts such as four different EMAIL values, two PHONE values, and a half dozen OTHERID values for a specific INIT_SURNAME. Do you want to see all of them in all combinations, only the newest (and how does SQL determine which is newest), the most common, or something different... Just about every time I get this kind of request, the handling of conflicts is different.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  8. #8
    Join Date
    Jul 2010
    Posts
    7
    Thanks for the effort so far

    At this stage I would be happy to get a single "complete" row per customer, not really worried about dropping the occasional value, creating a pseudo newest if you like.

    Am also tending towards the temp table route

  9. #9
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Code:
    SELECT INIT_SURNAME
    ,  MAX(EMAIL)
    ,  MAX(PHONE)
    ,  MAX(OTHERID)
       FROM customer
       GROUP BY INIT_SURNAME
    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  10. #10
    Join Date
    Jul 2010
    Posts
    7
    Nope, afraid not that would only return 2 rows and not 3

    Quote Originally Posted by teebag View Post
    INIT_SURNAME which is fully poplated but not unique to a customer. How can I merge the rows for matching customers where they have the same INIT_SURNAME and share one other common data element.

  11. #11
    Join Date
    Jul 2010
    Posts
    7
    Quote Originally Posted by teebag View Post
    where they have the same INIT_SURNAME and share one other common data element
    Quote Originally Posted by teebag View Post
    get a single "complete" row per customer
    i.e at least 2 columns have to match to say it is the same customer, in this case init_surname and any other column, in the example there are 2 J*SOAP's

    J*SOAP|js@x.com|4321|4
    J*SOAP|otherjoe@y.com||2

  12. #12
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Code:
    create table
    customer
    (
    INIT_SURNAME varchar(20),
    EMAIL varchar(20),
    PHONE varchar(20),
    OTHERID varchar(20),
    )
    
    insert into customer
    values('J*SOAP','js@x.com','4321','');
    insert into customer
    values('J*SOAP','js@x.com','','');
    insert into customer
    values('J*SOAP','','4321','4');
    insert into customer
    values('J*SOAP','otherjoe@y.com','','2');
    insert into customer
    values('J*DOE','','12345','3');
    
    SELECT
       IDENTITY(INT, 1, 1) AS id, *
       INTO #foo
       FROM customer
    
    SELECT *
       FROM #foo AS a
       JOIN #foo AS b
          ON (a.INIT_SURNAME = b.INIT_SURNAME
          AND a.id < b.id)
       WHERE a.EMAIL = b.EMAIL
          OR a.OTHERID = b.OTHERID
          OR a.PHONE = b.PHONE
    
    DROP TABLE #foo
    DROP TABLE customer
    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  13. #13
    Join Date
    Jul 2010
    Posts
    7
    PatP you're a star, got it working.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •