Page 1 of 2 12 LastLast
Results 1 to 15 of 30
  1. #1
    Join Date
    Dec 2012
    Location
    Huntsville, AL
    Posts
    379
    Provided Answers: 2

    Answered: Determining If Common Values Exist Between Two Pieces of Data

    I have an interesting problem that NEEDS an interesting solution. For this example, assume that I have two sets of data that contains words separated by spaces. Set 1 contains 'green yellow orange red silver' set 2 contains 'black blue red white pink green'. I need to find out if any of the values in set 1 are contained in the data in set 2 - in this example, the subset I am looking for contains 'green' and 'red' as these two individual values are present in both set 1 and set 2. The number of exact matches themselves are not important for this project, all I need to know is if there are ANY common values between the two data sets (I think the proper terminology is 'intersection of sets').

    I can do this by breaking down one of the sets into its individual components and scripting to check if this information is contained in the other set. For example the scripting logic would be like this 'If [$set 1] Contains 'black'... do this' yet I would have to use a 'For Each... Next' command to loop back a total of six times to check each of the individual values in set 2 via establishing a variable and using 'Regular Expressions' ('... blue red white pink green').

    I know that this method will work but it is slow and I have thousands of set 1 data that must be compared to set 2 data (I just scripted this somewhat convoluted logic this morning... it works but it takes some time!). I am curious if any of the EXCEPTIONALLY bright people here know of a more efficient method to compare two sets of data together to see if any common values are present. All I need to know is that either 'Yes' (1) or 'No' (0) concerning the existence of common values between the two sets. If so, how can I go about doing this in Brilliant Database?
    Daryl G

  2. Best Answer
    Posted by MacHeuS


  3. #2
    Join Date
    Apr 2013
    Posts
    226
    Presumably, as you are only interested in any match you don't necessarily need to loop six times, as you can drop out of the loop as soon as any match is found.

    Can this be done with a query made up of a series of OR rules? Not sure whether performance will be any better as you would be using "contains".

    Matty

  4. #3
    Join Date
    Dec 2012
    Location
    Huntsville, AL
    Posts
    379
    Provided Answers: 2
    matty1965,
    The information contained within each data set changes with each comparison (I failed to include that vital piece of information). Because of that dynamic nature, I do not believe that a query would work.

    Unfortunately, Brilliant Database will NOT allow you to drop out of a loop when a match is made - that ability would work out extremely nice but once you start a loop, it will run through its entirety regardless of any 'Break', 'Goto... Labels', or usage of 'If' statements. I have tried every trick I know for exiting a 'For Each.. Next' loop prematurely/early and that ability is not present. If you have been able to make this work, please share!
    Daryl G

  5. #4
    Join Date
    Jul 2012
    Location
    Ireland
    Posts
    815
    Provided Answers: 17
    Does the number of words change ?
    Is it always 5 or some other set amount?

    Regards
    James
    Last edited by Tubbritt; 07-21-14 at 21:34.

  6. #5
    Join Date
    Jul 2012
    Location
    Ireland
    Posts
    815
    Provided Answers: 17
    Here's an example of a single query that does a dynamic search or records in another folder based on the selected record.

    The folder “Blanks” contains 5 records. Each record contains a single text field that's populated with 5 words separated by spaces.

    When you run the query “Find Words” it will run a script first that will extract the 5 words from the selected records text field and populate each word into a separate variable.

    Now that we have the 5 words separated into variables the actual query search runs and it searches the folder "History" for matches based on “Contains”, and “OR”

    Regards
    James.
    Attached Files Attached Files

  7. #6
    Join Date
    Apr 2013
    Posts
    226
    I've never experienced the inability to drop out of a loop if certain conditions are met. Has anyone else?

  8. #7
    Join Date
    Dec 2012
    Location
    Huntsville, AL
    Posts
    379
    Provided Answers: 2
    James,
    Both sets of data change - there is never a constant or repeatable amount of data. Set 1 may have only one word whereas set 2 may have 12 words. The next comparison could have eight words in set 1 and three words in set two. Because of this dynamic nature of the data, I do not think that a query could be used successfully for this analysis. Looping via a 'For Each... Next' does work but it takes too long for my liking.

    matty1965,
    If you make a simple 'For Each... Next' loop and step through the script line by line I suspect that you will see that it will loop the whole original number of loops regardless of any attempts at exiting the loop early. I have tried all options that I am aware of and NOTHING would change that. Changing the value of the variable controlling the number of loops did nothing, 'Goto Labels' was ignored, and various attempts at using VBScripts (which Brilliant Database claims to support) had no affect.

    I would love to see this ability actually work within Brilliant Database but it does not. I reported this bug to Brilliant Database last year and received a rely that they, too, were able to replicate the problem and said that it would be corrected in future releases (this was prior to V10 coming out) yet it still exists. I have not tried it in v10.54... maybe it has been fixed but I know it does not work in v10.5. If you were able to make it work in the current release version I would greatly appreciate knowing what you did!
    Daryl G

  9. #8
    Join Date
    Jul 2012
    Location
    Ireland
    Posts
    815
    Provided Answers: 17
    Hi Daryl.
    If the amount of words is unknown, is there's any known value?
    For example, do you know that it will never exceed a maximum of 24 words for example?

    Regards
    James

  10. #9
    Join Date
    Apr 2013
    Posts
    226
    Hi Daryl,

    To drop out of a For Each you have to insert a "Break" which you can find in the "Add Cycle" menu.

    Matty

  11. #10
    Join Date
    Dec 2012
    Location
    Huntsville, AL
    Posts
    379
    Provided Answers: 2
    James,
    For argument's sake, assume 12 words as a maximum for set 1 and set 2. In actual practice, 12 words should be the maximum for one set while the other set could have hundreds (limited only to the number that can be held in a Recordset). That would be a massive query (if that is what you are thinking). Yes, I realize that this is an odd problem but that is what I am trying to tackle! Given the constraints of the problem, I do not think that it can be dealt with in a simpler fashion.

    matty1965,
    I have tried that and it does not work. I will post an example later to demonstrate that once you start a 'For Each... Next' loop in Brilliant Database that it cannot be stopped early regardless of what instructions/scripting are included instructing it to do otherwise. I have fought this issue for a very long time.
    Daryl G

  12. #11
    Join Date
    Apr 2013
    Posts
    226
    Hi Daryl,

    I suggest upgrading to 10.54 as I have applications with scripts breaking out of For Each loops by using Break.

    Matty

  13. #12
    Join Date
    Apr 2013
    Posts
    226
    Hi Daryl,

    Working example attached. Click on test button and select all the records. You can then choose when/if to break out of the For Each loop.

    Matty

    PS This is the normal way of exiting a loop in many programming languages.
    Attached Files Attached Files
    Last edited by matty1965; 07-22-14 at 17:59.

  14. #13
    Join Date
    Jul 2012
    Posts
    94
    Provided Answers: 4
    Did you try this? Is this slow?
    Attached Files Attached Files
    I do not know English well. I am using Google Translate for translation. Please do not laugh my sentences.

  15. #14
    Join Date
    Jul 2012
    Location
    Ireland
    Posts
    815
    Provided Answers: 17
    Hi Daryl.
    I seem to have completely misunderstood what you are trying to do.

    I assumed you have a record selected that contains a field which could contain upwards of 12 words, and you wanted to run a search to see if any other records from a different folder contained any of the 12 values from the selected record.

    Sounds to me now like your trying to compare a batch of records against another batch of records.

    Regards
    James

  16. #15
    Join Date
    Dec 2012
    Location
    Huntsville, AL
    Posts
    379
    Provided Answers: 2
    James,
    No, your earlier understanding is correct. Just as an example of what I am looking at right now consider this:

    one variable contains this information - 5712 45620 11284 2761 78959 2851 5723 V0382 2768 5728 53530 5533

    the second variable contains - 9915 5457 7534 3722 2851 741 3897 3895 9907 4523 8856 9390 734 5794 4513 4516 8872 9671 3395 8927 0001 045 0099 0045 3895 9907 640 23186 007 V0555 7860 E009 2768

    The information I need to know is whether or not both data sets contain any of the same 'pieces' (in this example, 2851 and 2768 appear in both variables). There are slightly more than 105K records that will be compared to the second variable (note that the second variable may change each time the analysis is run so it, too, is dynamic). I am able to get correct answers now using my method but it is slow. I am going to test out MacHeuS's suggestion shortly.
    Daryl G

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •