Results 1 to 4 of 4
  1. #1
    Join Date
    Jul 2011
    Posts
    5

    Unanswered: need help managing duplicate rows

    Hi All,

    I am pretty new in databases. I have a project that uploads tab delimited files using powerbuilder and ms sql server. I have problems managing duplicates rows. this duplicates does not go away because I don't have the right script. Below are my scripts so far:

    INSERT INTO TMP_FLOWTHRU_2
    SELECT DISTINCT * FROM TMP_FLOWTHRU
    WHERE APS_NO IN
    (SELECT DISTINCT TMP_FLOWTHRU.APS_NO FROM (
    select APS_NO, COUNT(*) AS CNT from TMP_FLOWTHRU
    GROUP BY APS_NO
    HAVING COUNT(*)>1) TMP_FLOWTHRU);

    on TMP_FLOWTHRU_2, there should be a row that should be copied like the ones below (I prefer keeping the newest if possible):

    APS_NO VERI_COMPLETED_DATE
    71311002584 2011-07-13 17:03:00.000
    71311002584 2011-07-13 17:04:00.000

    below is my delete script:
    DELETE TMP_FLOWTHRU
    WHERE APS_NO IN (SELECT DISTINCT TMP_FLOWTHRU.APS_NO FROM (
    select APS_NO,COUNT(*) AS CNT from TMP_FLOWTHRU
    GROUP BY APS_NO
    HAVING COUNT(*)>1) TMP_FLOWTHRU);

    the two rows stay after this exercise.

    please help.

    if there is already a related thread to this, please attach this post.

    thanks.

  2. #2
    Join Date
    Mar 2007
    Location
    Holmestrand, Norway
    Posts
    332
    What you seek is something like this: How do I remove duplicate tuples from a relation without any candidate keys?. In your case parttion by and order by will be on different columns.
    Ole Kristian Velstadbråten Bangås - Virinco - MSSQL.no - Facebook - Twitter

  3. #3
    Join Date
    Jul 2011
    Posts
    5
    thanks for the reply.

    here's what I did, partition by and order by are different columns:
    DELETE FROM TMP_FLOWTHRU
    WHERE APS_NO IN (SELECT DISTINCT APS_NO FROM
    (select APS_NO,COUNT(*) AS CNT from TMP_FLOWTHRU
    GROUP BY APS_NO
    HAVING COUNT(*)>1) TMP_FLOWTHRU);
    WITH duplicates as (
    select APS_NO, row_number() over (partition by mobile_number order by account_number) as rowno
    from TMP_FLOWTHRU
    ) delete from duplicates where rowno > 1;

    both rows got deleted. I need one row to stay.

    I tried another insert to TMP_FLOWTHRU to keep one row of the duplicates:
    INSERT INTO TMP_flowthru
    SELECT * FROM TMP_FLOWTHRU_2
    where aps_no not in
    (SELECT DISTINCT APS_NO FROM TMP_flowthru);

    but the same rows below:
    APS_NO VERI_COMPLETED_DATE
    71311002584 2011-07-13 17:03:00.000
    71311002584 2011-07-13 17:04:00.000

    keeps on being inserted. i just need one.
    Last edited by furei1976; 08-01-11 at 06:59.

  4. #4
    Join Date
    Jul 2011
    Posts
    5
    I tried this and it only copies one duplicate discarding the others.

    SELECT * FROM TMP_flowthru_2;

    WITH CTE AS (
    SELECT RN=ROW_NUMBER() OVER (PARTITION BY mobile_number
    ORDER BY account_name)
    FROM TMP_FLOWTHRU_2)
    DELETE CTE
    WHERE RN > 1;

    INSERT INTO TMP_flowthru
    SELECT * FROM TMP_flowthru_2;
    Last edited by furei1976; 08-01-11 at 12:15. Reason: not working okay

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •