Results 1 to 4 of 4
  1. #1
    Join Date
    Apr 2006
    Location
    System.Web.UI.WebControls.Home.BedRoom2
    Posts
    42

    Lightbulb Unanswered: optimization challenge

    Well i wanted to prove to some guys that cursors are not really that important .
    So this code is suppose to remove duplicate tuples from a table without temporary tables or cursors. Except it needs some optimization(and alot of system down time, not sure about that ).
    I would like it, if some one could find an instance of the table when the below code fails or some way to optimize the code or anything .

    --trashtable for real data
    create table abc
    (col1 tinyint,
    col2 tinyint,
    col3 tinyint)

    --trash values for trash table
    insert into abc values (1,1,1)
    insert into abc values (1,1,1)
    insert into abc values (1,1,1)
    insert into abc values (1,1,1)
    insert into abc values (2,2,2)
    insert into abc values (2,2,2)
    insert into abc values (2,2,2)
    insert into abc values (3,2,1)
    insert into abc values (2,2,3)
    insert into abc values (3,2,4)

    --check that there are ten rows
    select * from abc
    --check that there are only five distinct rows
    select distinct * from abc

    --run code : next 15 line as a batch
    declare @lp tinyint
    declare @col1 tinyint,@col2 tinyint,@col3 tinyint
    set @lp=1
    while @lp>0
    begin
    if not exists (select top 1 * from abc group by col1,col2,col3 having count(col1)>1)
    set @lp=0
    else
    begin
    select top 1 @col1 = col1,@col2 = col2,@col3 = col3 from abc group by col1,col2,col3 having count(col1)>1
    delete from abc where col1=@col1 and col2=@col2 and col3=@col3
    insert into abc values(@col1,@col2,@col3)
    end
    end

    --only distinct values left in trash table
    select * from abc

    --think code can be optimized
    --just wanted to prove: can be done without cursors or temporary tables

  2. #2
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    I know this is a cheat and I'm not exactly rising to the challenge however there is a pretty good discussion about (and links to) removing dupes here:
    http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=6256

    HTH
    Testimonial:
    pootle flump
    ur codings are working excelent.

  3. #3
    Join Date
    Apr 2006
    Location
    System.Web.UI.WebControls.Home.BedRoom2
    Posts
    42

    Lightbulb

    Thanks

    Had a look at the URL.
    another cool way is to use SET ROWCOUNT with DELETE.

  4. #4
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    While I realize that this uses a temp table, the usage is quite small and this is pretty efficient:
    Code:
    --trashtable for real data
    create table abc
    (  col1		tinyint
    ,  col2		tinyint
    ,  col3		tinyint)
    
    --trash values for trash table
    insert into abc values (1,1,1)
    insert into abc values (1,1,1)
    insert into abc values (1,1,1)
    insert into abc values (1,1,1)
    insert into abc values (2,2,2)
    insert into abc values (2,2,2)
    insert into abc values (2,2,2)
    insert into abc values (3,2,1)
    insert into abc values (2,2,3)
    insert into abc values (3,2,4)
    
    --check that there are ten rows
    select * from abc
    
    --check that there are only five distinct rows
    select distinct * from abc
    
    create table ptp_dupes
    (  col1		tinyint
    ,  col2		tinyint
    ,  col3		tinyint)
    
    INSERT INTO ptp_dupes (col1, col2, col3)
    SELECT col1, col2, col3
       FROM abc
       GROUP BY col1, col2, col3
       HAVING 1 < Count(*)
    
    BEGIN TRANSACTION
    
    DELETE FROM abc
       WHERE EXISTS (SELECT *
          FROM ptp_dupes
          WHERE  ptp_dupes.col1 = abc.col1
             AND ptp_dupes.col2 = abc.col2
             AND ptp_dupes.col3 = abc.col3)
    
    INSERT INTO abc (col1, col2, col3)
       SELECT col1, col2, col3
          FROM ptp_dupes
    
    COMMIT TRANSACTION
    
    SELECT          col1, col2, col3 FROM abc ORDER BY col1, col2, col3
    
    SELECT DISTINCT col1, col2, col3 FROM abc ORDER BY col1, col2, col3
    Unfortunately, there isn't anything I can recommend as even close to efficient that doesn't use a temp table at all. The cursor solutions are inefficient, and the code that you've shown is interesting, but not very efficient.

    -PatP

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •