Results 1 to 5 of 5
  1. #1
    Join Date
    May 2008

    Unanswered: Remove Duplicates based on 1 field

    I have read old posts related to removing duplicates but none of them seemed to help me.

    I am basically using the design mode to create this query. The Query pulls data from 2 tables ('All shipments' and 'ACT') and the primary key is 'Part Num'. There are several other fields as well but they are predominatly unique for each repeated 'Part Num'

    For each table, I just want the query to use the record from the first occurence of any particular 'Part Num' and discard latter record of a repeated 'Part Num'. This is the basic query that I have so far (displays 100s of repeated rows and blanks fields too):

    [All Shipments].[Product Size],
    [All Shipments].[Firing Lot],
    [All Shipments].[Ship ate],
    [All Shipments].Customer,
    [ACT].[Height diff],
    [ACT].[Parallelism E-C-E]
    [ACT].[Centerline Perp]
    FROM [ACT] INNER JOIN [All Shipments]
    ON [ACT].[Part Num] = [All Shipments].[Part Num];

    Because both tables have 'Part Num' duplicates (the rest of the fields are unique), the query pull 100s of duplicate rows and does not work. When I manually remove all duplicates from each table, it works perfectly. The problem is I will be adding in new data periodically and it's a lot of data and so I need an automatic way of doing this as I run the query.

    Any suggestions will be appreciated!

  2. #2
    Join Date
    Dec 2004
    Madison, WI
    Removing duplicate from 2 different tables with multiple records having the same value in a field which the 2 queries are joined on can be tricky, especially if you're not using code to walk through the records. I had an old, old post on some techniques to do this but I'll try to recall some suggestions here.

    1. You can create a query for each table which has an expression column which concanenates 2 or more fields together (i.e. NewPriKey: [Part Num] & [Ship Date].) Then use these 2 new queries to either make tables and rejoin back in an additional query to delete or just use the 2 new queries in another query to delete (ie. nested query(s)), joining on the new concanenated field.

    2. You mentioned first occurence which could be problematic. You could try an initial query (again, using nested queries as in 1 above) where you utilize the First or Min (ie. sum query) and then either make a table of these values or use it in another query. It's difficult to explain without seeing the data and actually designing a routine.

    You may also want to establish some sort of "Unique" primary key in the table(s) as it sounds as though Part Num is not unique for the records. Again, you could do this on the form or in code combining values of 2 or more fields when entered/inserted/updated. This would ultimately make it easier in your deleting queries of duplicate values. Otherwise you're dealing with make table queries or nested queries to remove duplicates. The key is to first get a set of unique records from each table in the query and then join in another query those queries. If you have duplicates from joining 2 tables, removing duplicates is very difficult (if able to) on deleting records which you don't want to delete which are duplicated (especially in 1 query.) As you've probably discovered, using Distinct doesn't make it doable with a delete type query.
    Last edited by pkstormy; 07-11-08 at 21:59.
    Expert Database Programming
    MSAccess since 1.0, SQL Server since 6.5, Visual Basic (5.0, 6.0)

  3. #3
    Join Date
    May 2008
    Thanks! I will try to do what you have suggested. I forgot to also mention another problem within this mix-

    I am linking the original tables to excel files where the PART NUM column (COLUMN A) is populated on a concatenate formula, which goes all the way down to row 300, even though the other data (COLUMNS B- AA) only show data up to row 150. This was done like so in order for one to put data into COUMNS B- AA and have the PART NUM automatically populated in COLUMN A. Anyways, I link these sheet in access and this is causing the linked table to have rows up to 300 with many of them blank. So when I join this table with another (which has the same formula problem), I get hundreds of blank rows.

    Is there a way to resolve without deleting the formulas in the original Excel sheets....


  4. #4
    Join Date
    Nov 2007
    Adelaide, South Australia
    Only skimmed this so forgive me if this is way off, but can you create queries to extract only the valid data you need and then link with those queries?
    Owner and Manager of
    CypherBYTE, Microsoft Access Development Specialists.
    Microsoft Access MCP.
    And all around nice guy!

    "Heck it's something understood by accountants ... so it can't be 'that' difficult..." -- Healdem
    "...teach a man to code and he'll be frustrated for life! " -- georgev

  5. #5
    Join Date
    May 2008
    Thank you for your suggestions!

    I figured that the main issue was with having non-unique records by the primary key (PART NUM). I also eliminated the blank concatenate formula rows in the linked excel sheets. Now, I could perform a UNION on all the seperate tables (ACT1, ACT2,...ACT16) without having blank repeated records and then created queries to pull the specific data I wanted. The final step was to create a query that joined he 'united' ACT tables with the [All shipments] table by PART NUM

    Not the perfect solution but this works!
    Thanks again!

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts