Results 1 to 15 of 15
  1. #1
    Join Date
    Oct 2001
    Location
    Bangalore
    Posts
    186

    Unanswered: Complex query problem, help needed

    Hi All,
    We have a table in our database (MS SQL) where in no primary key defined.
    Table name is NAMELINK.
    Structure is as follows :
    NameId1 int 4 not null
    Category char 5 nullable
    Relation12 char 20 nullable
    NameId2 int 4 not null
    Relation21 char 21 nullable
    Remarks char 30 nullable
    Dependent smallint 2 not null

    While conversion we are assuming that NameId1, NameId2 is a
    composite primary key (looking at the data majorly it is unique). But
    still there are some cases where it can not be unique.
    Now we want a select query to fetch only those records where the
    combination of NameId1, NameId2 is unique. We tried self join but
    somehow it's not working.
    Please help about this.

    Regards,
    Prashant
    Prashant

  2. #2
    Join Date
    May 2002
    Posts
    18

    Re: Complex query problem, help needed

    Try this query

    select * from NAMELINK inner join ( select distinct NameId1, NameId2 from NAMELINK) NL on NAMELINK.NameId1 = NL.NameId1 and NAMELINK.NameId2 = NL.NameId2

    I have written the query on the fly without testing it. So test it and post your reply.





    Originally posted by dahalkar_p
    Hi All,
    We have a table in our database (MS SQL) where in no primary key defined.
    Table name is NAMELINK.
    Structure is as follows :
    NameId1 int 4 not null
    Category char 5 nullable
    Relation12 char 20 nullable
    NameId2 int 4 not null
    Relation21 char 21 nullable
    Remarks char 30 nullable
    Dependent smallint 2 not null

    While conversion we are assuming that NameId1, NameId2 is a
    composite primary key (looking at the data majorly it is unique). But
    still there are some cases where it can not be unique.
    Now we want a select query to fetch only those records where the
    combination of NameId1, NameId2 is unique. We tried self join but
    somehow it's not working.
    Please help about this.

    Regards,
    Prashant

  3. #3
    Join Date
    Oct 2001
    Location
    Bangalore
    Posts
    186
    Hi Mohamed,
    Sorry to say that this is not going to work for me..

    I need to remove the records which are duplicating this way:

    Nameid1 Nameid2
    1 2
    2 1

    I need just one of the following record not both..

    Prashant
    Prashant

  4. #4
    Join Date
    May 2002
    Posts
    18


    Hmm... Seems to be pretty interesting problem. I had been struggling like hell to find a solution to it for about 4 hours continously. My objectives were to avoid temporary tables and avoid cursors; To solve by a single query. Unfortunately I could not avoid correlated query (in the second part of UNION keyword) which I hate because of performance reasons. If number the duplicates as u said were low then the performance should be good. Performance decreases as the number of duplicates increases. Anyway here goes the solution.


    SELECT NAMELINK.* FROM NAMELINK LEFT JOIN
    (
    SELECT nl1.* FROM NAMELINK nl1
    INNER JOIN NAMELINK nl2 on nl1.Nameid2 = nl2.Nameid1 and nl1.Nameid1= nl2.Nameid2
    ) MyNameLink ON Namelink.Nameid1 = MyNameLink.Nameid1 and Namelink.Nameid2 = MyNameLink.Nameid2
    WHERE MyNameLink.Nameid1 IS NULL AND MyNameLink.Nameid2 IS NULL

    UNION ALL

    SELECT nl2.* FROM NAMELINK nl1
    INNER JOIN NAMELINK nl2 on nl1.Nameid2 = nl2.Nameid1 and nl1.Nameid1= nl2.Nameid2
    WHERE
    nl2.Nameid1 =
    ( SELECT TOP 1 nl3.Nameid1 FROM NAMELINK nl3 INNER JOIN NAMELINK nl4 ON nl3.Nameid2 = nl4.nameid1 and nl3.Nameid1 = nl4.Nameid2 WHERE (nl3.Nameid1 + nl3.Nameid2) = (nl1.Nameid1 + nl1.Nameid2) ) and
    nl2.Nameid2 =
    ( SELECT TOP 1 nl3.Nameid2 FROM NAMELINK nl3 INNER JOIN NAMELINK nl4 ON nl3.Nameid2 = nl4.nameid1 and nl3.Nameid1 = nl4.Nameid2 WHERE (nl3.Nameid1 + nl3.Nameid2) = (nl1.Nameid1 + nl1.Nameid2) )



    I tested it thoroughly and validated the solution. Just copy, paste it for execution and tell me the result.

    Good Luck.




    Originally posted by dahalkar_p
    Hi Mohamed,
    Sorry to say that this is not going to work for me..

    I need to remove the records which are duplicating this way:

    Nameid1 Nameid2
    1 2
    2 1

    I need just one of the following record not both..

    Prashant
    Last edited by Mohamed Yousuff; 11-08-02 at 06:52.

  5. #5
    Join Date
    Oct 2001
    Location
    Bangalore
    Posts
    186
    Cheers!!!!
    You have done it..
    Thank's a lot man....

    Prashant
    Prashant

  6. #6
    Join Date
    May 2002
    Posts
    18


    I would be glad if you could tell me the number of rows in the namelink table and your comments on the performance of the solution. ( I am much worried about performance in all my solutions).



    Originally posted by dahalkar_p
    Cheers!!!!
    You have done it..
    Thank's a lot man....

    Prashant

  7. #7
    Join Date
    Oct 2001
    Location
    Bangalore
    Posts
    186
    Right now i have a database with small number of records in the table.
    They are around 250. The performance test can not be done on such small db.
    I will let you know when it is done on a live database with lot's of records.

    -Prashant
    Prashant

  8. #8
    Join Date
    Nov 2002
    Location
    South Africa
    Posts
    9

    Talking Re: Complex query problem, help needed

    select NameId1, NameId2 from NAMELINK group by NameId1, NameId2
    having count(*) = 1

    run this query - it should bring back some positive information!!!

    Enjoy,

    Neil de Later,
    Johannesburg - South Africa

    e-mail : neil@bex.co.za

    Originally posted by dahalkar_p
    Hi All,
    We have a table in our database (MS SQL) where in no primary key defined.
    Table name is NAMELINK.
    Structure is as follows :
    NameId1 int 4 not null
    Category char 5 nullable
    Relation12 char 20 nullable
    NameId2 int 4 not null
    Relation21 char 21 nullable
    Remarks char 30 nullable
    Dependent smallint 2 not null

    While conversion we are assuming that NameId1, NameId2 is a
    composite primary key (looking at the data majorly it is unique). But
    still there are some cases where it can not be unique.
    Now we want a select query to fetch only those records where the
    combination of NameId1, NameId2 is unique. We tried self join but
    somehow it's not working.
    Please help about this.

    Regards,
    Prashant

  9. #9
    Join Date
    Oct 2001
    Location
    Bangalore
    Posts
    186
    Hi bex,

    This will return all the rows in the table..
    I think you have a confusion on the way the question is asked.
    It's not just duplicate in both the tables.
    It's combination should not repeat even vice versa.

    Nameid1 Nameid2
    1 2
    2 1

    Here i need any one record.

    -Prashant
    Prashant

  10. #10
    Join Date
    Dec 2002
    Location
    Czech Republic
    Posts
    249

    Re: Complex query problem, help needed

    --assume invalid PK NameId1, NameId2 <-> NameId2, NameId1

    --without duplicities
    select n1.*
    from NAMELINK n1
    where 1=(
    select count(*) from NAMELINK n2 where
    (n1.NameId1=n2.NameId1 and n1.NameId2=n2.NameId2)
    or
    (n1.NameId1=n2.NameId2 and n1.NameId2=n2.NameId1)
    )
    order by NameId1, NameId2

    --only duplicities, correctly ordered
    select n1.*
    from NAMELINK n1
    inner join NAMELINK n2
    on n1.NameId1=n2.NameId1 and n1.NameId2=n2.NameId2
    where 1<(
    select count(*) from NAMELINK n2 where
    (n1.NameId1=n2.NameId1 and n1.NameId2=n2.NameId2)
    or
    (n1.NameId1=n2.NameId2 and n1.NameId2=n2.NameId1)
    )
    order by
    case when n1.NameId1>n1.NameId2
    then n1.NameId2
    else n1.NameId1
    end
    ,case when n1.NameId1>n1.NameId2
    then n1.NameId1
    else n1.NameId2
    end
    Last edited by ispaleny; 12-25-02 at 13:25.

  11. #11
    Join Date
    Oct 2001
    Location
    Bangalore
    Posts
    186
    Hi,

    How can we delete the duplicates in the similar scenario?

    Regards,
    Prashant

  12. #12
    Join Date
    Dec 2002
    Location
    Czech Republic
    Posts
    249
    --Try this:

    -- Coping source table, adding PK "Id"
    select "Id"=IDENTITY(int,1,1),*
    into dbo.TempNameLink
    from dbo.NameLink
    order by
    case when NameId1>NameId2
    then NameId2
    else NameId1
    end
    ,case when NameId1>NameId2
    then NameId1
    else NameId2
    end
    GO
    -- accelerating by indexes on computed columns
    alter table dbo.TempNameLink
    add "compNameId1" as
    case when n2.NameId1>n2.NameId2
    then n2.NameId2
    else n2.NameId1
    end
    ,"compNameId2" as
    case when n2.NameId1>n2.NameId2
    then n2.NameId1
    else n2.NameId2
    end
    GO
    create unique clustered index IC_TempNameLink on TempNameLink ("compNameId1","compNameId2","Id")
    GO
    alter table dbo.TempNameLink
    add constraint PK_TempNameLink
    primary key nonclustered ("Id")
    GO

    --inserting unique values
    --Information from Category,Relation12,Relation21,Remarks,Dependent columns in duplicities is lost.
    create table dbo.NewNameLink (
    NameId1 int not null
    ,NameId2 int not null
    ,Category char(5) null
    ,Relation12 char(20) null
    ,Relation21 char(21) null
    ,Remarks char(30) null
    ,Dependent smallint not null
    ,constraint PK_NewNameLink
    primary key ( NameId1,NameId2 )
    ,constraint CK_NewNameLink
    check ( NameId1<NameId2 )
    )
    GO
    insert dbo.NewNameLink(NameId1,NameId2,Category,Relation1 2,Relation21,Remarks,Dependent)
    select n1.NameId1,n1.NameId2,n1.Category,n1.Relation12,n1 .Relation21
    ,n1.Remarks,n1.Dependent
    from dbo.TempNameLink n1
    join (
    select "Id"=min("Id")
    from dbo.TempNameLink n2
    group by n2."compNameId1",n2."compNameId2"
    ) XXX on n1."Id"=XXX."Id"

  13. #13
    Join Date
    Oct 2001
    Location
    Bangalore
    Posts
    186
    Thank's for your help but Sorry to say , this will eliminate all the duplicates.

    We want to keep one of the duplicate and eliminate the rest of the values.

    I think i was not clear in my question.

    Regards,
    Prashant

  14. #14
    Join Date
    Oct 2001
    Location
    Bangalore
    Posts
    186
    All,

    Here's the fix for the complex situation:

    select "Id"=IDENTITY(int,1,1),*
    into dbo.TempNameLink
    from dbo.NameLink
    order by
    case when NameId1>NameId2
    then NameId2
    else NameId1
    end
    ,case when NameId1>NameId2
    then NameId1
    else NameId2
    end
    GO

    CREATE TABLE [NameLink1] (
    [NameID1] [int] NOT NULL DEFAULT (0),
    [Category] [char] (5) COLLATE SQL_Latin1_General_CP1_CI_AS NULL DEFAULT (''),
    [Relation12] [char] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL DEFAULT (''),
    [NameID2] [int] NOT NULL DEFAULT (0),
    [Relation21] [char] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL DEFAULT (''),
    [Remarks] [char] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL DEFAULT (''),
    [Dependent] [smallint] NOT NULL DEFAULT (0),
    [UpdateDate] [datetime] NULL ,
    [UpdatedByID] [int] NOT NULL DEFAULT (0),
    [rowguid] uniqueidentifier ROWGUIDCOL NOT NULL DEFAULT (newid())
    ) ON [PRIMARY]
    GO

    CREATE procedure CLEANLINK(@tempID integer output) as
    begin
    declare @Nameid1 integer,
    @Nameid2 integer,
    @ID integer,
    @COUNTER integer,
    @TOTAL integer
    declare TEMPCURSOR cursor dynamic scroll for select N.ID, N.NameID1, N.NameID2 from TEMPNAMELINK as N
    select @TOTAL = COUNT(*) from TEMPNAMELINK
    select @COUNTER=1
    open TEMPCURSOR
    while @COUNTER <= @TOTAL
    begin
    fetch next from TEMPCURSOR into @ID, @Nameid1, @Nameid2
    Begin
    IF not exists (Select 1 from namelink1 where Nameid1 = @Nameid1 and Nameid2 = @Nameid2)
    Begin
    INSERT INTO dbo.NameLink1(NameID1, Category, Relation12, NameID2, Relation21, Remarks, Dependent, UpdateDate, UpdatedByID, rowguid) SELECT NameID1, Category, Relation12, NameID2, Relation21, Remarks, Dependent, UpdateDate, UpdatedByID, rowguid from TEMPNAMELINK where nameid1 = @Nameid1 and nameid2 = @Nameid2 and ID = @ID
    end
    end
    select @COUNTER=@COUNTER+1
    end
    DEALLOCATE TEMPCURSOR
    end
    GO

    execute cleanlink 1
    go

    delete from NAMELINK
    go

    INSERT INTO dbo.NameLink(NameID1, Category, Relation12, NameID2, Relation21, Remarks, Dependent, UpdateDate, UpdatedByID, rowguid)
    SELECT NameID1, Category, Relation12, NameID2, Relation21, Remarks, Dependent, UpdateDate, UpdatedByID, rowguid from NAMELINK1
    go

    drop table NAMELINK1
    go

    DROP TABLE TEMPNAMELINK
    go

    DROP PROCEDURE CLEANLINK
    go

    This script does it all.

    Thanks to all,
    Prashant

  15. #15
    Join Date
    Dec 2002
    Location
    Czech Republic
    Posts
    249
    --So simply

    select "Id"=IDENTITY(int,1,1),NameID1, Category, Relation12, NameID2, Relation21, Remarks, Dependent, UpdateDate, UpdatedByID, rowguid
    into dbo.TempNameLink
    from dbo.NameLink
    order by NameId1,NameId2
    GO
    create clustered index IC_TempNameLink on dbo.TempNameLink (NameId1,NameId2)
    GO
    alter table dbo.TempNameLink
    add constraint PK_TempNameLink primary key ("Id")
    GO
    delete dbo.NameLink
    insert dbo.NameLink(NameID1, Category, Relation12, NameID2, Relation21, Remarks, Dependent, UpdateDate, UpdatedByID, rowguid )
    select NameID1, Category, Relation12, NameID2, Relation21, Remarks, Dependent, UpdateDate, UpdatedByID, rowguid
    from dbo.TempNameLink t
    join (
    select "Id"=min("Id")
    from dbo.TempNameLink
    group by NameID1, NameID2
    ) XXX on t."Id"=XXX."Id"
    drop table dbo.TempNameLink

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •