I am a bit of a SQL Server newbie and have a question. I'm trying to compare two text fields. Both are 56 character fields and are a company name (one company want to see how much customer overlap they have with a newly acquired company). As you can imagine the names are a bit different in each ABN AMRO versus ABN AMRO INC. I tried comparing the first 6, 7, 8 characters with some success. Is there a more advanced way to do this? I appreciate the help.
You can look at the SOUNDEX function to see whether it would be appropriate. Also, below is an Access Basic function I used to use for finding duplicate business names with spelling variations in a large Access database.
You could use this by linking Access to your SQL Server, or convert the logic to a stored procedure. The function works better if you strip the submitted text strings of all non-alphanumeric and vowel characters. Any match over 80% is almost certainly a duplicate, and anything over 50% should be flagged for manual review.
Function MATCH(FIRSTSTRING, SECONDSTRING) As Double
Dim COUNTER As Integer
'Returns a value between 0 and 1 indicating the degree of
'similarity between two strings. Similarity is defined as
'the number of two-character combinations the strings have
'in common divided by the possible matches.
HITS = 0
POSSIBLES = 1
If Len(STRING1) > 0 Then
If Len(STRING2) > 0 Then
POSSIBLES = Len(STRING1) + Len(STRING2) - 2
For COUNTER = 1 To Len(STRING1) - 1
SEARCHSTRING = Mid(STRING1, COUNTER, 2)
If InStr(STRING2, SEARCHSTRING) > 0 Then HITS = HITS + 1
For COUNTER = 1 To Len(STRING2) - 1
SEARCHSTRING = Mid(STRING2, COUNTER, 2)
If InStr(STRING1, SEARCHSTRING) > 0 Then HITS = HITS + 1
If HITS = 0 Then
MATCH = 0#
Else: MATCH = HITS / POSSIBLES
If it's not practically useful, then it's practically useless.