Results 1 to 4 of 4
  1. #1
    Join Date
    Jan 2010
    Posts
    7

    Unanswered: PostCode: Iterative partial match

    Hi there
    I've got an interesting problem with postcode searches. The question is how to find the nearest match if the exact isn't found using SQL
    (ie in a program it would be IF +100 then remove a trailing char and try again)

    Example Table
    ==========
    PostCode Long_Lat_Value
    --------- --------------
    DE4 Value#1
    DE45 Value#2
    DE45UA Value#3
    DE45UM Value#4
    DE45UZ Value#5

    So if I search for DE45UZ I get Value#5
    If I search for DE45UG then I would want to iterate by removing trailing chars until I found a match ... in this case a match would be on DE45 and give me Value#2
    And finally if I search for DE49AJ I iterate again and the match would return Value#1

    I'm sure this will have been done before using CRTs and fancy SQL so apologies if this is a repeat.
    Thanks in advance to any help
    Cheers
    Simon

  2. #2
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    The sample query returned this result:
    Code:
    SEARCH_WORD POSTCODE   LONG_LAT_VALUE
    ----------- ---------- --------------
    BUZ         No matched               
    DE45UG      DE45       Value#2       
    DE45UZ      DE45UZ     Value#5       
    DE49AJ      DE4        Value#1       
    
      4 record(s) selected.
    sample query:
    Code:
    ------------------------------ Commands Entered ------------------------------
    WITH sample_data
    (postcode , long_lat_value) AS (
    VALUES
      ('DE4'    , 'Value#1')
    , ('DE45'   , 'Value#2')
    , ('DE45UA' , 'Value#3')
    , ('DE45UM' , 'Value#4')
    , ('DE45UZ' , 'Value#5')
    )
    , search_words(search_word) AS (
    VALUES 'DE45UZ' , 'DE45UG' , 'DE49AJ' , 'BUZ'
    )
    SELECT search_word
         , CASE
           WHEN matched_rank >= 0 THEN
                postcode
           ELSE 'No matched'
           END  AS postcode
         , CASE
           WHEN matched_rank >= 0 THEN
                long_lat_value
           ELSE ''
           END  AS long_lat_value
      FROM (SELECT search_word
                 , postcode , long_lat_value
                 , matched_rank
                 , ROW_NUMBER() OVER(PARTITION BY search_word
                                         ORDER BY matched_rank DESC NULLS LAST) AS rn
              FROM (SELECT search_word
                         , postcode , long_lat_value
                         , MIN(CASE
                               WHEN LEFT(search_word , rank) = postcode
                               THEN rank
                               END
                              ) AS matched_rank
                     FROM  search_words s
                     CROSS JOIN
                           sample_data  t
                     INNER JOIN
                           (VALUES 1,2,3,4,5,6,7,8,9,10) r(rank)
                       ON  LENGTH(search_word) >= rank
                     GROUP BY
                           search_word
                         , postcode , long_lat_value
                   )
           )
     WHERE rn = 1
    ;
    ------------------------------------------------------------------------------
    
    SEARCH_WORD POSTCODE   LONG_LAT_VALUE
    ----------- ---------- --------------
    BUZ         No matched               
    DE45UG      DE45       Value#2       
    DE45UZ      DE45UZ     Value#5       
    DE49AJ      DE4        Value#1       
    
      4 record(s) selected.
    Last edited by tonkuma; 07-01-10 at 09:28. Reason: Changed 3 phrase(markd bold, red)

  3. #3
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Nested table expressions could be reduced to one and join r(rank) was not necessary.

    Another example:
    Code:
    ------------------------------ Commands Entered ------------------------------
    WITH sample_data
    (postcode , long_lat_value) AS (
    VALUES
      ('DE4'    , 'Value#1')
    , ('DE45'   , 'Value#2')
    , ('DE45UA' , 'Value#3')
    , ('DE45UM' , 'Value#4')
    , ('DE45UZ' , 'Value#5')
    )
    , search_words(search_word) AS (
    VALUES 'DE45UZ' , 'DE45UG' , 'DE49AJ' , 'BUZ'
    )
    SELECT search_word
         , COALESCE(postcode , 'No matched') AS postcode
         , COALESCE(long_lat_value , '')     AS long_lat_value
      FROM (SELECT s.* , t.*
                 , ROW_NUMBER()
                     OVER(PARTITION BY search_word
                              ORDER BY postcode DESC) AS rn
              FROM search_words s
              LEFT OUTER JOIN
                   sample_data  t
               ON  LEFT(search_word , LENGTH(postcode) ) = postcode
           )
     WHERE rn = 1
    ;
    ------------------------------------------------------------------------------
    
    SEARCH_WORD POSTCODE   LONG_LAT_VALUE
    ----------- ---------- --------------
    BUZ         No matched               
    DE45UG      DE45       Value#2       
    DE45UZ      DE45UZ     Value#5       
    DE49AJ      DE4        Value#1       
    
      4 record(s) selected.
    Last edited by tonkuma; 07-01-10 at 23:41. Reason: Replace with a simple way using LEFT OUTER JOIN.

  4. #4
    Join Date
    Jan 2010
    Posts
    7

    Smile

    Hi tonkuma
    Thanks a million for this :-) Utterly superb and elegant.
    I've tweaked it for zOS SQL and this is the result:


    WITH SAMPLE_DATA(POSTCODE, LONG_LAT_VALUE) AS (
    SELECT 'DE4' , 'Value#1' FROM SYSIBM.SYSDUMMY1 UNION ALL
    SELECT 'DE45' , 'Value#2' FROM SYSIBM.SYSDUMMY1 UNION ALL
    SELECT 'DE45UA', 'Value#3' FROM SYSIBM.SYSDUMMY1 UNION ALL
    SELECT 'DE45UM', 'Value#4' FROM SYSIBM.SYSDUMMY1 UNION ALL
    SELECT 'DE45UZ', 'Value#5' FROM SYSIBM.SYSDUMMY1
    )

    SELECT COALESCE(X.POSTCODE, 'No match'), COALESCE(X.LONG_LAT_VALUE, '')
    FROM (
    SELECT T.POSTCODE, T.LONG_LAT_VALUE
    , ROW_NUMBER()
    OVER(PARTITION BY SEARCH_WORD
    ORDER BY POSTCODE DESC) AS RN
    FROM (SELECT RPAD(REPLACE(UCASE('De4 5uQ'),' ', ''),10,' ') AS SEARCH_WORD FROM SYSIBM.SYSDUMMY1) AS S
    LEFT OUTER JOIN SAMPLE_DATA AS T
    ON LEFT(S.SEARCH_WORD, LENGTH(T.POSTCODE) ) = POSTCODE
    ) AS X
    WHERE RN = 1


    Works a treat
    Thanks again for your ultra-quick, excellent response
    Cheers
    Simon

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •