| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |

07-01-10, 04:00
|
|
Registered User
|
|
Join Date: Jan 2010
Posts: 6
|
|
|
PostCode: Iterative partial match
|
|
Hi there
I've got an interesting problem with postcode searches. The question is how to find the nearest match if the exact isn't found using SQL
(ie in a program it would be IF +100 then remove a trailing char and try again)
Example Table
==========
PostCode Long_Lat_Value
--------- --------------
DE4 Value#1
DE45 Value#2
DE45UA Value#3
DE45UM Value#4
DE45UZ Value#5
So if I search for DE45UZ I get Value#5
If I search for DE45UG then I would want to iterate by removing trailing chars until I found a match ... in this case a match would be on DE45 and give me Value#2
And finally if I search for DE49AJ I iterate again and the match would return Value#1
I'm sure this will have been done before using CRTs and fancy SQL so apologies if this is a repeat.
Thanks in advance to any help
Cheers
Simon
|
|

07-01-10, 07:36
|
|
Registered User
|
|
Join Date: Feb 2008
Location: Japan
Posts: 2,195
|
|
The sample query returned this result:
Code:
SEARCH_WORD POSTCODE LONG_LAT_VALUE
----------- ---------- --------------
BUZ No matched
DE45UG DE45 Value#2
DE45UZ DE45UZ Value#5
DE49AJ DE4 Value#1
4 record(s) selected.
sample query:
Code:
------------------------------ Commands Entered ------------------------------
WITH sample_data
(postcode , long_lat_value) AS (
VALUES
('DE4' , 'Value#1')
, ('DE45' , 'Value#2')
, ('DE45UA' , 'Value#3')
, ('DE45UM' , 'Value#4')
, ('DE45UZ' , 'Value#5')
)
, search_words(search_word) AS (
VALUES 'DE45UZ' , 'DE45UG' , 'DE49AJ' , 'BUZ'
)
SELECT search_word
, CASE
WHEN matched_rank >= 0 THEN
postcode
ELSE 'No matched'
END AS postcode
, CASE
WHEN matched_rank >= 0 THEN
long_lat_value
ELSE ''
END AS long_lat_value
FROM (SELECT search_word
, postcode , long_lat_value
, matched_rank
, ROW_NUMBER() OVER(PARTITION BY search_word
ORDER BY matched_rank DESC NULLS LAST) AS rn
FROM (SELECT search_word
, postcode , long_lat_value
, MIN(CASE
WHEN LEFT(search_word , rank) = postcode
THEN rank
END
) AS matched_rank
FROM search_words s
CROSS JOIN
sample_data t
INNER JOIN
(VALUES 1,2,3,4,5,6,7,8,9,10) r(rank)
ON LENGTH(search_word) >= rank
GROUP BY
search_word
, postcode , long_lat_value
)
)
WHERE rn = 1
;
------------------------------------------------------------------------------
SEARCH_WORD POSTCODE LONG_LAT_VALUE
----------- ---------- --------------
BUZ No matched
DE45UG DE45 Value#2
DE45UZ DE45UZ Value#5
DE49AJ DE4 Value#1
4 record(s) selected.
|
Last edited by tonkuma; 07-01-10 at 08:28.
Reason: Changed 3 phrase(markd bold, red)
|

07-01-10, 08:57
|
|
Registered User
|
|
Join Date: Feb 2008
Location: Japan
Posts: 2,195
|
|
|
|
Nested table expressions could be reduced to one and join r(rank) was not necessary.
Another example:
Code:
------------------------------ Commands Entered ------------------------------
WITH sample_data
(postcode , long_lat_value) AS (
VALUES
('DE4' , 'Value#1')
, ('DE45' , 'Value#2')
, ('DE45UA' , 'Value#3')
, ('DE45UM' , 'Value#4')
, ('DE45UZ' , 'Value#5')
)
, search_words(search_word) AS (
VALUES 'DE45UZ' , 'DE45UG' , 'DE49AJ' , 'BUZ'
)
SELECT search_word
, COALESCE(postcode , 'No matched') AS postcode
, COALESCE(long_lat_value , '') AS long_lat_value
FROM (SELECT s.* , t.*
, ROW_NUMBER()
OVER(PARTITION BY search_word
ORDER BY postcode DESC) AS rn
FROM search_words s
LEFT OUTER JOIN
sample_data t
ON LEFT(search_word , LENGTH(postcode) ) = postcode
)
WHERE rn = 1
;
------------------------------------------------------------------------------
SEARCH_WORD POSTCODE LONG_LAT_VALUE
----------- ---------- --------------
BUZ No matched
DE45UG DE45 Value#2
DE45UZ DE45UZ Value#5
DE49AJ DE4 Value#1
4 record(s) selected.
|
Last edited by tonkuma; 07-01-10 at 22:41.
Reason: Replace with a simple way using LEFT OUTER JOIN.
|

07-02-10, 05:17
|
|
Registered User
|
|
Join Date: Jan 2010
Posts: 6
|
|
Hi tonkuma
Thanks a million for this :-) Utterly superb and elegant.
I've tweaked it for zOS SQL and this is the result:
WITH SAMPLE_DATA(POSTCODE, LONG_LAT_VALUE) AS (
SELECT 'DE4' , 'Value#1' FROM SYSIBM.SYSDUMMY1 UNION ALL
SELECT 'DE45' , 'Value#2' FROM SYSIBM.SYSDUMMY1 UNION ALL
SELECT 'DE45UA', 'Value#3' FROM SYSIBM.SYSDUMMY1 UNION ALL
SELECT 'DE45UM', 'Value#4' FROM SYSIBM.SYSDUMMY1 UNION ALL
SELECT 'DE45UZ', 'Value#5' FROM SYSIBM.SYSDUMMY1
)
SELECT COALESCE(X.POSTCODE, 'No match'), COALESCE(X.LONG_LAT_VALUE, '')
FROM (
SELECT T.POSTCODE, T.LONG_LAT_VALUE
, ROW_NUMBER()
OVER(PARTITION BY SEARCH_WORD
ORDER BY POSTCODE DESC) AS RN
FROM (SELECT RPAD(REPLACE(UCASE('De4 5uQ'),' ', ''),10,' ') AS SEARCH_WORD FROM SYSIBM.SYSDUMMY1) AS S
LEFT OUTER JOIN SAMPLE_DATA AS T
ON LEFT(S.SEARCH_WORD, LENGTH(T.POSTCODE) ) = POSTCODE
) AS X
WHERE RN = 1
Works a treat 
Thanks again for your ultra-quick, excellent response
Cheers
Simon
|
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|