Results 1 to 2 of 2
  1. #1
    Join Date
    Apr 2007

    Unanswered: Lookup a "closest match" value in a table


    I have a 2 column table called tblstorecat. It has a text field and a corresponding number field: storename & storenumber

    I have a data entry form that has a text field called title. When a user types in the title of the product, I would like to populate another field, in the same table as title, called newstorenumber.

    So I need either a query or event code to take the text from title and compare it to storename, retrieve the corresponding storenumber, and save that number into newstorenumber.

    Finally, storename will contain strings like "pepsi" or "coke" but title will contain strings like "pepsi can" or "coke bottle" so I need a way for access to give me the closest match.

    So user types "pepsi can" and suddenly newstorenumber = 1 (for pepsi)

    It doesn't need to be visible to the user, so a query expression would be fine.

    Thank you very much in advance

  2. #2
    Join Date
    Dec 2002
    Préverenges, Switzerland
    try looking for 'trigram' with google.

    the basic idea is to make an index table that holds e.g. for 'pepsi' entries for 'pep', 'eps', and 'psi' each with FK to the 'pepsi' record.
    then break the search-for string into 'trigrams' and search in your index table for each of them. the more hits (for equal length indexed words!!!) the more 'relevant' the match.

    it is not perfect!

    i have some old sort-of-trigram test code using dummy data (data is 974 first-names that were most popular in the US in 2004). i added 'pepsi' & 'coke' & 'sprite' as names and tried a few tests.

    'a big can of pepsi' - pepsi is the only hit
    'can of pepsy' - pepsi is the only hit (similarly with 'depsi' 'pypsi' ...because 'pep', 'eps', 'psi' are rare(!) in US names)
    'i prefer coke' - coke is top followed by 7 others
    'a big can of pepsi is my favorite' - pepsi is top, followed by sprite, trevor, gustavo and 11 others
    'sprite is so nice' - sprite is top followed by 18 others
    'i love koke' - equal weighting for coke & brooke

    so far so good

    'peksi' - no hits
    'a big can of pepsi is my destiny' - destiny is the top hit, then pepsi, then 26 others
    'marshmallow and pepsi' - marshall, mallory, then pepsi, then 73 others

    not surprising: nothing matches with the 'k' typo (it kills all three valid trigrams for pepsi). for the other two, there are more matching trigrams in the wrong words so the weightings get screwed.

    Last edited by izyrider; 04-16-07 at 13:20. Reason: ooooops: stupid typo
    currently using SS 2008R2

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts