Page 1 of 2 12 LastLast
Results 1 to 15 of 17
  1. #1
    Join Date
    Aug 2007
    Posts
    88

    Question Unanswered: VBA code: how to detect duplicate record in text file and prompt for deletion?

    Hi Gurus,

    I am pretty new to Access VBA and desperately need your guidence to resolve this problem.

    My company has a test machine to test sample quality and write the test result values to a text file like the following without the header.

    Essentially, the combination of "test criteria", "sampleno", "endno", and "test type" should be a unique value.

    When the tester connect one end of a sample to the machine, finish testing, he clicks Save button on the screen, the machine will write one line to the text file. But some times, the tester may make mistake and need to test twice for this end of sample for the same test criteria and same test type, the machine can not tell and will write a new line, which cause duplicate records like the following example: test criteria: 1311, Sampleno: 4, endno: 2, and test type: Relative Power; This appear 2 lines and it will take a lot of works to clean up after all of the samples are tested. The tester often forgot which line is bad and should be deleted. The tester can not see the file until the all the testing are done.

    Is any way to write a access vba code to read the text file (while the unknown program in the test machine is writing data in the text file) to detect the duplication right away and prompt a interface to display the duplicate record and let the tester to selet which one to delete, then the program will automatically delete the duplicate record immediately without waiting the testing are done?

    When the testing is going, the number of lines is increasing. I want the access program can on going detect the duplication and prompt for deleting right away. Can it be done by using windows scheduled tasks?

    Is it reachable?

    Thank you so much! Any input will be greatly appreciated.


    Time test Sampleno endno test type result na
    criteria

    10:23 AM 1311 4 1 Relative Power -0.07 4
    10:23 AM 1558 4 1 Relative Power -0.06 4
    10:23 AM 1484 4 1 Relative Power -0.13 4
    10:23 AM 1626 4 1 Relative Power -0.08 4
    10:23 AM 1626 4 1 Back Reflection -80.00 4
    10:23 AM 1484 4 1 Back Reflection -76.10 4
    10:24 AM 1558 4 1 Back Reflection -79.90 4
    10:24 AM 1311 4 1 Back Reflection -75.20 4
    10:25 AM 1311 4 2 Relative Power -0.24 4
    10:26 AM 1558 4 2 Relative Power -0.13 4
    10:26 AM 1311 4 2 Relative Power -0.17 4
    10:27 AM 1484 4 2 Relative Power -0.19 4
    10:27 AM 1626 4 2 Relative Power -0.12 4
    10:27 AM 1626 4 2 Back Reflection -80.00 4
    10:27 AM 1484 4 2 Back Reflection -78.50 4
    10:27 AM 1311 4 2 Back Reflection -76.70 4
    10:27 AM 1558 4 2 Back Reflection -80.00 4

  2. #2
    Join Date
    May 2004
    Location
    New York State
    Posts
    1,178
    Hello, newguy, and welcome to Access and to the dbforum.

    I suggest that you add some code into the Save comand button that before the save is actually performed, VBA does a lookup in the table for a dupe, and if it detects one, prompt the user with both records and have the user choose which one to save.

    By the way, if the user decides to save the last one, you don't have to delete the previous entry (unless the time factor or the order of tests is critical). Simply overwrite the previous result with the current result and update the record.

    Hope this helps,
    Sam

  3. #3
    Join Date
    Aug 2007
    Posts
    88

    I don't have control on that save command button

    Quote Originally Posted by Sam Landy
    Hello, newguy, and welcome to Access and to the dbforum.

    I suggest that you add some code into the Save comand button that before the save is actually performed, VBA does a lookup in the table for a dupe, and if it detects one, prompt the user with both records and have the user choose which one to save.

    By the way, if the user decides to save the last one, you don't have to delete the previous entry (unless the time factor or the order of tests is critical). Simply overwrite the previous result with the current result and update the record.

    Hope this helps,
    Sam
    Sam,

    Thank you so much for response! Unfortunately, the save button program is an unknown, foreign program and I don't have control on it. It's not part of the Access VBA program I need to develop to help the tester.

    I am thinking to develop an access program run every 5 minutes to read the text file and write into a access table (tableA), create a query to detect the duplicate records then prompt the tester the duplication and he can decide which one he want to delete. I then can save the one he want to delete in another table (tableB). Finally, when the tests are done, I can compare the tableA and tableB to delete the duplicate records in tableA.

    My questions are :
    1. how to make the access program run every 5 minutes automatically without the tester intervene? Is there any time wait event in VBA to make it happen?
    2. How to write the vba code to make the program only load the new lines in the text file into the tableA when the access program's next run without loading the whole file again?

    Thanks again!

  4. #4
    Join Date
    May 2004
    Location
    New York State
    Posts
    1,178
    I've been confounded with this myself. In Access, it's far easier to detect a dupe and prevent it from being written to a table, then to play catch-and-delete-the-dupe after the fact. Especially when the user wants to save one, and for the user to tell Access which one to save. Good luck!

    Anybody?

    Sam

  5. #5
    Join Date
    May 2004
    Location
    New York State
    Posts
    1,178
    I had another thought as well. (I was in QA for many years, and in one company one of my chief responsibilities was Test Department documentation.) When a test was performed, especially if there was an anomaly, the tester was required to document it. Even if there was user error, the point was not to place blame, but to document what was done, for future traceability.

    What I'm suggesting, then, is that the tester have a log where he can record any anomalies, including his own.

    If the tester does start keeping such a log, then the procedure you're proposing can be run once, at the start - or end - of each session. He'll have the logbook in front of him to remind him of which result was correct and which was the result of an error.

    Sam

  6. #6
    Join Date
    Feb 2007
    Posts
    15
    Hello newguyinaccess

    I wrote the following code in vb6, buyt I guess with a few changes it will work fine in access.
    In short it will open two text files. One have the data that will be scanned for duplicated rows, the second file will receive only unique rows from the first one. I am askying for apologies in advance but I don't have, right now, time to translate the comments, but I think the code is pretty clear, nevertheless if you have any question feel free to ask, I will answer as soon as possible.

    OS

    Now the code

    Dim strLinha As String 'lê o conteúdo da linha
    Dim strLinhaCopia As String 'lê o conteúdo da linha de comparação
    Dim strFichOrig As String 'nome do ficheiro original
    Dim intVector As Integer 'percorre as posições do vector
    Dim bytLimInf As Byte 'valor do limite inferior do vector
    Dim bytLimSup As Byte 'valor do limite superior do vector

    Sub Cria_Ficheiros()
    If strFichOrig = "" Then
    MsgBox "Não existem (mais) ficheiros para analisar!", vbOKOnly + vbExclamation, "JBA - Importação"
    Exit Sub
    End If
    'Abre o ficheiro de texto em modo de leitura
    Open strCaminho & "\" & strFichOrig For Input As #1
    'No próximo loop vai contar quantas linhas tem o ficheiro
    intVector = 0
    Do Until (EOF(1) = True)
    Line Input #1, strLinha
    intVector = intVector + 1
    Loop
    'Define o vector com o número de linhas do ficheiro
    ReDim strVerifica(intVector)
    bytLimInf = 0
    bytLimSup = intVector - 1
    'Fecha o ficheiro para, ao abrir, voltar ao início
    Close #1
    'Abre o ficheiro original
    Open strCaminho & "\" & strFichOrig For Input As #1
    'Abre um ficheiro para escrever as linhas sem duplicados
    Open strCaminho & "\" & strFichOrig & ".fix" For Output As #2
    'Abre um ficheiro para escrever as linhas duplicadas
    Open strCaminho & "\" & strFichOrig & ".dup" For Output As #3
    intContador = 1
    Do Until (EOF(1) = True)
    'Coloca o conteúdo da linha actual em strLinha
    Line Input #1, strLinha
    'Verifica se é a primeira linha
    If intContador = 1 Then
    strVerifica(0) = strLinha
    intVector = 0
    'Não é a primeira linha
    Else
    'Percorre o ficheiro de comparação à procura de linhas iguais
    'Abre uma cópia do ficheiro original para comparação
    Open strCaminho & "\" & strFichOrig For Input As #4
    intContadorCopia = 1
    Do Until (EOF(4) = True)
    Line Input #4, strLinhaCopia
    'Encontrou uma linha igual
    If strLinhaCopia = strLinha Then
    'Verifica se estão ambos no mesmo número de linha
    '(o que significa que a linha não é duplicada)
    If intContadorCopia <= intContador Then
    'O conteúdo das linhas é igual e estão em linhas diferentes
    '(o que significa que a linha está duplicada)
    Else
    Print #3, strLinhaCopia; intContadorCopia
    End If
    'Não encontrou linhas iguais
    End If
    'Incrementa o contador do ficheiro de comparação
    intContadorCopia = intContadorCopia + 1
    Loop
    'Percorre o vector à procura de duplicados
    For intVector = 0 To intContador - 1
    'Se a posição do vector estiver vazia, então aceita o conteúdo
    If strVerifica(intVector) = "" Then
    strVerifica(intVector) = strLinha
    'Se o conteúdo do vector for igual à linha, ignora-a
    ElseIf strVerifica(intVector) = strLinha Then GoTo saida_duplicados:
    'Se o conteúdo do vector não for igual à linha
    'Verifica se a posição do vector está atrás da posição da linha
    ElseIf intVector < intContador Then
    End If
    Next
    saida_duplicados:
    Close #4
    End If
    intContador = intContador + 1
    Loop
    'escreve para o ficheiro sem duplicados
    intVector = 0
    Do Until strVerifica(intVector) = ""
    Print #2, strVerifica(intVector)
    intVector = intVector + 1
    Loop
    Close #3
    Close #2
    Close #1
    End Sub

  7. #7
    Join Date
    Aug 2007
    Posts
    88

    I can not make that suggestion to the tester. He will be unhappy.

    Quote Originally Posted by Sam Landy
    I had another thought as well. (I was in QA for many years, and in one company one of my chief responsibilities was Test Department documentation.) When a test was performed, especially if there was an anomaly, the tester was required to document it. Even if there was user error, the point was not to place blame, but to document what was done, for future traceability.

    What I'm suggesting, then, is that the tester have a log where he can record any anomalies, including his own.

    If the tester does start keeping such a log, then the procedure you're proposing can be run once, at the start - or end - of each session. He'll have the logbook in front of him to remind him of which result was correct and which was the result of an error.

    Sam
    Sam,

    I totally agree with your idea. You are absolutely right about the log. However, I am not in the position to suggest the tester to do the extra work. I am a developer only.

    Thanks for the suggestion.

    newguyinaccess

  8. #8
    Join Date
    Aug 2007
    Posts
    88

    Thanks for sharing the wonderful code

    Quote Originally Posted by osilvaware
    Hello newguyinaccess

    I wrote the following code in vb6, buyt I guess with a few changes it will work fine in access.
    In short it will open two text files. One have the data that will be scanned for duplicated rows, the second file will receive only unique rows from the first one. I am askying for apologies in advance but I don't have, right now, time to translate the comments, but I think the code is pretty clear, nevertheless if you have any question feel free to ask, I will answer as soon as possible.

    OS

    Now the code
    Hi OS,

    Thanks for willing to share the wonderful code. That's really nice. The issue I have is there is no duplicate line in the text file. The duplicate records I mentioned is the combination of 4 fields. Is there any way to detect the duplication for the combination of 4 fields not the entire line in the text file? Thanks again!

    newguy

  9. #9
    Join Date
    Feb 2007
    Posts
    15
    Hi there,

    My apologies, but I clearly understood your question right now.

    And YES! There is a quiet simple solution for your problem.

    Import your text file into an access table, then create a query that will group the records. You achive this creating a normal query then pulse the sum symbol button and verify that the new line (Total) have the value Group By in it. But remember... the query should only have the four fields you want to search for duplication, otherwise the first answer I gave you would be correct.

    HTH

    OS

  10. #10
    Join Date
    Nov 2004
    Location
    out on a limb
    Posts
    13,692
    Provided Answers: 59
    if you are running a batch process, under windows scheduler then is it reasonable to expect user intervention on error.

    In my experience you have few options
    either
    make assumptions about the error (ie is it an obvious erro that can be safely deleted /ignored
    or
    terminate processing and restore to a known good state, ideally a backup taken immediately before the batch process kicked off.. in an ideal world you could take a backup, work on the backup, if the process completes with no errors copy (or rename) the backup to the working backend.
    or
    isolate the records that have faults (similar to the process used to import records into access, and then let the user decide what they want to do with those records).

    some of that depends on what you want to do, how criticla the application is and how sensitive the data is.

    Id agree the 'best' way round your problem is to write each record to the table, check for an error and handle any error accordingly. ideally you should probably check to see if the record exists prior to writing, but if the unique items are part of the key, for performance reasons I'd be minded to write the record anyway, and trap for a duplicate record error

    it does mean having to write some vba to open the file, read through it sequentially and insert records into your table(s). but if its a regualr process I'd go down the VBA route.. call the function/macro as part of the command line switch that wakes access as part of your windows scheduler event
    I'd rather be riding on the Tiger 800 or the Norton

  11. #11
    Join Date
    Nov 2003
    Posts
    1,487
    Ahhh...I got bored so I made a simple sample (attached) of what I think you are wanting to do. It looks like a lot of code but really 90 percent of it is comments explaining what does what.

    With code you can do really do whatever you want.

    Now...I'm not saying this is the ultimate solution to your situation but I am saying you can accomplish this with code. Whether it's the right or wrong way is up to ya'all to decide. There are other ways to accomplish this with faster more streamline code and API's.

    Attached Files Attached Files
    Environment:
    Self Taught In ALL Environments.....And It Shows!


  12. #12
    Join Date
    Aug 2007
    Posts
    88
    Quote Originally Posted by osilvaware
    Hi there,

    My apologies, but I clearly understood your question right now.

    And YES! There is a quiet simple solution for your problem.

    Import your text file into an access table, then create a query that will group the records. You achive this creating a normal query then pulse the sum symbol button and verify that the new line (Total) have the value Group By in it. But remember... the query should only have the four fields you want to search for duplication, otherwise the first answer I gave you would be correct.

    HTH

    OS
    Hi OS,

    Don't say apologies, please. I really appreciate your suggestion.

    newguy

  13. #13
    Join Date
    Aug 2007
    Posts
    88

    Thanks.

    Quote Originally Posted by healdem
    if you are running a batch process, under windows scheduler then is it reasonable to expect user intervention on error.

    In my experience you have few options
    either
    make assumptions about the error (ie is it an obvious erro that can be safely deleted /ignored
    or
    terminate processing and restore to a known good state, ideally a backup taken immediately before the batch process kicked off.. in an ideal world you could take a backup, work on the backup, if the process completes with no errors copy (or rename) the backup to the working backend.
    or
    isolate the records that have faults (similar to the process used to import records into access, and then let the user decide what they want to do with those records).

    some of that depends on what you want to do, how criticla the application is and how sensitive the data is.

    Id agree the 'best' way round your problem is to write each record to the table, check for an error and handle any error accordingly. ideally you should probably check to see if the record exists prior to writing, but if the unique items are part of the key, for performance reasons I'd be minded to write the record anyway, and trap for a duplicate record error

    it does mean having to write some vba to open the file, read through it sequentially and insert records into your table(s). but if its a regualr process I'd go down the VBA route.. call the function/macro as part of the command line switch that wakes access as part of your windows scheduler event
    Hi healdem,
    Thanks for your response.
    Thanks for willing to share your experience. From your statements, it seems like windows scheduler task is not a good choice. Thanks. I will avoid that.
    Working on the backup first then copy to working data sounds a very good practice.

    Sorry, I am not up to that level yet, so I can not fully understand all of your statements. "call the function/macro as part of the command line switch that wakes access as part of your windows scheduler event"
    I will have to study hard to know how to implement that.

    Thanks again!

    Newguy

  14. #14
    Join Date
    Aug 2007
    Posts
    88

    Super!

    Quote Originally Posted by CyberLynx
    Ahhh...I got bored so I made a simple sample (attached) of what I think you are wanting to do. It looks like a lot of code but really 90 percent of it is comments explaining what does what.

    With code you can do really do whatever you want.

    Now...I'm not saying this is the ultimate solution to your situation but I am saying you can accomplish this with code. Whether it's the right or wrong way is up to ya'all to decide. There are other ways to accomplish this with faster more streamline code and API's.

    Hi CyberLynx,

    Thank you so much for being so bored to write such a "simple sample". (I had reviewed 3 times, but still can not fully understand the code.) Plus, thanks for the very detail explanations for the code in comments. That's really kind. Now I really "stuck on your opinion". Maybe I am not smart enough, I will need to spend more time to study your code.
    I tested it, and it works like a charm. Perfect! It is exactly what I am looking for. It really amazing!
    There is only one small issue in the program. In your testdata, there are 2 lines not unique and are also exactly the same. 1626 4 2 Back Reflection -80.00. The program detect and prompt this duplicate line, but it also "delete" both of them. Any way to delete only one of them? (It could happen in real world.) Thanks.


    The really formate of the file is like
    5/29/2007-9:55 AM 1311 Initial 23C 50%RH 1 1 Relative Power -0.26 4

    it's a space and tab mixed delimited file. How to modify the code to resolve this?

    For each endno (1, 2) in Each sampleno (1, 2, 3, ...22), it must be tested in each test criteria (1311 or 1558 or 1484 or 1626) for each test type (Relative Power, or Back Reflection) (these 4 field's combination must be unique). As you can see in my first post for this issue. If there are more than 4 entries, delete the duplicate lines. Is there any way to prompt the tester if he forget one required test (less than 4 entries)? (If you are still bored.)

    Thank you so much! You are really a code master!

    Newguy

  15. #15
    Join Date
    Nov 2003
    Posts
    1,487
    From one Grasshoppa to another

    I have to this point opened a doorway for you to pass into another realm. For you to succeed there with your quest, you will need to somewhat understand the environment you are standing in otherwise the future will be fruitless.

    Take the time to fully understand the simple code provided to you for if you do not, the time ahead looks grim toward any other enhancements you may endeavour to undertake with regards to your project if this is indeed the path you wish to take.

    Environment:
    Self Taught In ALL Environments.....And It Shows!


Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •