Results 1 to 2 of 2
  1. #1
    Join Date
    Jun 2011
    Inside your mind

    Unanswered: SQL - Remove duplicates when other fields are equal

    Hi guys,

    Having a spot of bother writing this in SQL. I can achieve the results I'm after using VBA, but it will run much slower than a suitable SQL statement.

    I have a table that looks like this:

    Table  - tmp_orders_messy
    Fields - Order Number
           - Product vName
           - Quantity
    In this table, I want to do the following:

    1) Find all the duplicate [Product vName]s, ONLY when they're associated with the same [Order Number].
    2) Once found, I want to sum the [Quantity] fields for those values.
    3) Then I want to either:
    a) Delete all the duplicate records found, and create a new one with the same details, except for the modification of the [Quantity] field.
    b) Modify one of the duplicates to contain the summed [Quantity], and delete all the other duplicates.

    To better explain this, let me give you an example...

     Order Number | Product vName | Quantity 
          900     |  Set  Square  |     2
          900     |   Blue  Hat   |     1
          900     |  Set  Square  |     4
          940     |   Blue  Hat   |     3
          940     |  Set  Square  |     1
    Is the data held in tmp_orders_messy, and the SQL statement should reduce it to it:

     Order Number | Product vName | Quantity 
          900     |  Set  Square  |     6
          900     |   Blue  Hat   |     1
          940     |   Blue  Hat   |     3
          940     |  Set  Square  |     1

    Like I said, using a combination of loops, ifs and an array I can achieve this with VBA, but its pretty slow when its checking through 40,000+ records. I'm hoping that a fairly sophisticated SQL statement will be much faster.

    Any ideas?

    Thanks in advance guys.
    Looking for the perfect beer...

  2. #2
    Join Date
    Mar 2009
    Provided Answers: 15
    This SELECT query will provide the expected result:
    SELECT   Tmp_orders_messy.[Order Number], 
             Tmp_orders_messy.[Product vName], 
             Sum(Tmp_orders_messy.Quantity) AS Quantity
    FROM     Tmp_orders_messy
    GROUP BY Tmp_orders_messy.[Order Number], 
             Tmp_orders_messy.[Product vName];
    You can use it as the SELECT part of an INSERT query if you want, but I don't see any reason why you could not keep the table as it is and use the query instead, specially if its a temporary table which the Tmp_ prefix seems to indicate.
    Have a nice day!

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts