Results 1 to 4 of 4
  1. #1
    Join Date
    Mar 2015
    Posts
    31

    Unanswered: VBA Delete Duplicates for 2 tables

    Hello everyone,

    I'm looking for your help because my database is swimming with duplicates and i haven't been able to dispose of them by using alternative methods such as using Criteria in Queries. So i have been obliged to use VBA which i have never used before.

    I want to delete duplicates from 2 tables:
    - 011- Extraction FA ENSEMBLE
    - 010- ENSEMBLE before table

    All my tables are bound by a Key which is called "Key". However this key is not enough in order to precisely deal with the duplicates. So in order to deal with them i suppose I have to create a subquery using the following field "MaxDeN°BL" (field which indicates the Package number of the last sent item).However, the records containing empty fields musn't be deleted !!

    I gave it a shot with only one table with the following however it is my first time at trying VBA so it didn't work

    DELETE FROM 011- Extraction FA ENSEMBLE
    WHERE Key <> (SELECT Min (Key) AS MinOfKey FROM 011- Extraction FA ENSEMBLE AS Dupe
    WHERE (Dupe.Key = 011- Extraction FA ENSEMBLE.Key)
    AND (Dupe.MaxDeN°BL = 011- Extraction FA ENSEMBLE.MaxDeN°BL));

    Any help will be deeply appreciated !
    Last edited by Guy Winfield; 05-07-15 at 06:16.

  2. #2
    Join Date
    Nov 2004
    Location
    out on a limb
    Posts
    13,692
    Provided Answers: 59
    OK for me there are two issues here
    first and most important is diagnose why you are getting duplicates and designing those duplicates out of the system. whether you do that through primary keys, or unique indexes ( a primary key is also a unique index, there can be many unique indexes on a table,m but only EVER one primary key)

    having fixed the root cause then its time to go through your data and remove the duplicates (whether you actually delete the duplicates or flag them as hidden is your design call. you could be a smart aleck and flag as hidden at first, then once you know the data is right delete 'em.

    the real problem with duplicate data is knowing which is/are the right row(s) to delete and which to keep.


    in terms of your current query
    1) its garbage, you cannot have two uses of WHERE in the same query
    2) you are using reserved words or symbols in table and column names

    primarily this is using spaces, but you also use the reserved word key. add top that list the hyphen
    Access can do a reasonable job fixing these errors on your behalf, but that means you are relying on Access to do that, and frankly sometimes it just throws its hands up in horror and sulks
    as Myle has already said here you should delimt these bastardised table and column names using square brackets eg:-
    Code:
    ...[011- Extraction FA ENSEMBLE].[Key]....
    instead of
    Code:
    ...011- Extraction FA ENSEMBLE.Key...
    OK so how might this process work
    well the first step is to identify the duplicates

    use a query to COUNT the number of rows sharing the same value (whatever column / columns) that identifes rows and should be unique
    eg
    Code:
    select [key] from [011- Extraction FA ENSEMBLE] 
    where count[key] > 1
    ..so that will identify where the column key has more than one row for each value

    use that query as a feedstock into your next query... for now, instead of using delete develop your query using select... once you have proved the select is pulling the right rows to be deleted then you can quickly change it to delete. but bear in mind its always a smart call to do this sort of operation on a copy of the original data.. prove everythign works, and then after you've doen that apply those changes to the live copy of the data and copy that back over the original once you know the changes are right. learn to love backups, in in situations backups are a developers best friend, but if you are hacking live data backups are more than a developers best friend and you can't have enough of them (backups that is, not necessarily friends)). also prove your backups are 'good' before making changes. you dont' want to screw somethign up, then find your backup is compromised.

    In order to delete the duplicates you need to know how to identify the redundant rows. something only you can identify....
    it could be that you have used an autonumber column as the primary key in place of what should be the 'right' key. if so you need to decide if the first (lowest) or last (highest autonumber for each duplicate is the one to keep. if the answer to that is it all depends then unless you can clearly identify the row to be deleted you will have to do this as a manual process

    bus the process MUST be
    identiofy why you are gettign duplicates
    eliminate that reason
    then fix your data

    just fixing the data means you will have to do this process again and again and again.
    I'd rather be riding on the Tiger 800 or the Norton

  3. #3
    Join Date
    Mar 2015
    Posts
    31
    First of all thank you very much for taking time to answer with a very detailed explanation !

    I'm going to add in a bit of info from what you said in your dissertation.
    I do know where the Duplicates are coming from and why they are being duplicated but the problem is that i have to import numerous files which all already contain duplicates, something which is beyond my control unfortunately. So i can restrict these duplicates from duplicating again (like bloody rabbits).

    They Key that i have comes from a formula that i made using the order number + production line. This gives me a unique key for all of my orders. However, one order may not be shipped just once, it can be partially shipped 3 times but for 1 order, this creates the duplicates. So upon that basis, i know exactly which lines are to be deleted/hidden and why they are here.

    That brings me to the field I was talking about earlier on "MaxDeN°BL" which indicates the package number of the last sent item from the order, that would then only show the most updated version of that order's delivery and order number so the previous records would be duplicates and pretty much useless.

    As for working with saved files, i couldn't agree more and i'm currently working on a saved file with another backed up saved file, after losing one database i got the message


    I am using a French version of access so i'm not sure if the "Key" will be an issue ? and as for the hyphen ? And i should create 2 queries by the looks of it then ?

    Thanks again !

  4. #4
    Join Date
    Nov 2004
    Location
    out on a limb
    Posts
    13,692
    Provided Answers: 59
    yes you can always control the data coming into your system. there is never any reason to accept duplicated data, within a system designed and controlled by you.

    how you stop that happening is up to you

    you could import the data into temporary tables, then use 2 queries per table.
    one query inserts new rows where the relevant PK doesn't exist
    the other query updates existing rows, where the PK already exists
    ...at no time do you ever allow duplicates to enter the system.
    ...that could be done as a batch process overnight, requiring no user intervention.

    key is a reserved word in Access, there is no indication that the French version of Access/JET woudl use clef in place of keyt, so trreat key as a reserved word, unless there is a separate page from Microsoft identifying a different list of reserved words in French.As Key is a reserved word in ANSI SQL as well I very much doubt that key will not be a reserved word in any language variant of Access
    I'd rather be riding on the Tiger 800 or the Norton

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •