Results 1 to 9 of 9
  1. #1
    Join Date
    Jan 2004
    Posts
    66

    Unanswered: Problem in deleting duplicate rows

    Hi,

    Following is my De-Dup(Delete Duplicate rows) query . It deletes duplicate rows in ph_extract_tmp1 table on the basis of where condition. Now for 122 millions of rows , it's getting hanged . So can I use any analytical function to solve this problem ? OR is there any other way for De-Dup the records from this table ?

    ph_extract_tmp1 table is properly partitioned and indexed .

    DELETE ph_extract_tmp1 A
    WHERE EXISTS
    (
    SELECT /*+ USE_HASH(A,B) */
    1
    FROM
    ph_extract_tmp1 B
    WHERE
    A.aol_indv_id = B.aol_indv_id
    AND NVL(A.aol_day_ph_nb,'0') = NVL(B.aol_day_ph_nb,'0')
    AND NVL(A.aol_ngt_ph_nb,'0') = NVL(B.aol_ngt_ph_nb,'0')
    AND NVL(A.csi_day_ph_nb,'0') = NVL(B.csi_day_ph_nb,'0')
    AND A.ROWID > B.ROWID
    )


    Thanks in advance ....
    himridul

  2. #2
    Join Date
    Dec 2003
    Location
    Noida, India.
    Posts
    171
    Try this, if this workd better

    Code:
    delete from ph_extract_tmp1
    where rowid not in 
    	(select min(rowid) 
    	from ph_extract_tmp1 
    	group by aol_indv_id, NVL(A.aol_day_ph_nb,'0'),NVL(A.aol_ngt_ph_nb,'0'), NVL(A.csi_day_ph_nb,'0')	
    	)
    Oracle can do wonders !

  3. #3
    Join Date
    Dec 2003
    Location
    Noida, India.
    Posts
    171
    Sorry for syntax, i've not corrected that.

    Originally posted by cmasharma
    Try this, if this workd better

    Code:
    delete from ph_extract_tmp1
    where rowid not in 
    	(select min(rowid) 
    	from ph_extract_tmp1 
    	group by aol_indv_id, NVL(aol_day_ph_nb,'0'),NVL(aol_ngt_ph_nb,'0'), NVL(csi_day_ph_nb,'0')	
    	)
    Oracle can do wonders !

  4. #4
    Join Date
    Jan 2004
    Posts
    66
    cmasharma ,

    For 0.2 million of data , query cost is giving 43 (first it was 18) .
    himridul

  5. #5
    Join Date
    Feb 2004
    Location
    Chennai
    Posts
    53

    Re: Problem in deleting duplicate rows

    hi,

    This would help u.

    delete from ph_extract_tmp1 where rowid in
    (select a from
    (select min(rowid) a
    from ph_extract_tmp1
    group by aol_indv_id, NVL(aol_day_ph_nb,'0'),NVL(aol_ngt_ph_nb,'0'), NVL(csi_day_ph_nb,'0')) b
    right outer join
    (select rowid a from ph_extract_tmp1 ) c
    on b.a=c.a
    where b.a is null)
    -Mathan
    For a quick pocket reference of oracle refer http://pocketoracle.blogspot.com/

  6. #6
    Join Date
    Feb 2004
    Location
    Chennai
    Posts
    53

    Re: Problem in deleting duplicate rows

    a small corr... (select c.a)

    delete from ph_extract_tmp1 where rowid in
    (select c.a from
    (select min(rowid) a
    from ph_extract_tmp1
    group by aol_indv_id, NVL(aol_day_ph_nb,'0'),NVL(aol_ngt_ph_nb,'0'), NVL(csi_day_ph_nb,'0')) b
    right outer join
    (select rowid a from ph_extract_tmp1 ) c
    on b.a=c.a
    where b.a is null)




    Originally posted by getmathan
    hi,

    This would help u.

    delete from ph_extract_tmp1 where rowid in
    (select a from
    (select min(rowid) a
    from ph_extract_tmp1
    group by aol_indv_id, NVL(aol_day_ph_nb,'0'),NVL(aol_ngt_ph_nb,'0'), NVL(csi_day_ph_nb,'0')) b
    right outer join
    (select rowid a from ph_extract_tmp1 ) c
    on b.a=c.a
    where b.a is null)
    -Mathan
    For a quick pocket reference of oracle refer http://pocketoracle.blogspot.com/

  7. #7
    Join Date
    Jan 2004
    Posts
    66
    getmathan ,

    Actually your query is giving cost 53 for 0.2 millions of rows .
    I need cost around 10 max.Previously my original query was giving cost 18.


    Anyway Thank u .....
    himridul

  8. #8
    Join Date
    Mar 2002
    Location
    Reading, UK
    Posts
    1,137
    Dont forget execution plan cost is an estimate and isnt always accurate. You need to run the statement and look at the sql trace file. Try copying the tables to your test environment, analyze them and then see which one is the best.

    Alan

  9. #9
    Join Date
    Feb 2004
    Location
    India
    Posts
    135
    Hi all,

    the easiest method to delete the duplicate records from the table is by using rowid.

    Delete from where rowid not in (select max(rowid) from group by ;

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •