Results 1 to 14 of 14
  1. #1
    Join Date
    May 2012
    Posts
    13

    Unanswered: Soundex'd Searching

    Hi,

    I'm a student employee for a Canadian company and have barely touched, let alone heard of, DB2 until this point. I'm tasked with creating search functionality (it was a requirement for the system I was put on) that both ignores the order of search terms and does "sound alikes". tonkuma had helped me a great deal with Levenshtein distances last week and was very helpful, but unfortunately it was a bit too slow.

    The situation is that I am searching though book or video titles (10,000 rows, 1-15 words each), and it was mentioned "sound alikes" (AKA soundex" should be taken into account, as well as things like "Heating" being close to "Heater" (so I assume a soundex difference >=2 should count for something rather than being thrown away).

    My first thought was, for each row, split into words (remove punctuation, split on spaces, throw into temp table), and then do an exclude on a temp table with my soundexed search term, sorting by the least number of leftover rows (row count on the exclude) ascending. However this doesnt take into account for partial soundex matches (no "difference()").

    I have no set "due date" on this problem as of now, but want to tackle this and get it out of the way rather than having a contractor come in.

    Does anybody have any advice on performing a fuzzy soundex, order irrelevant search? I'm aware that older versions have limitations, so I think the version we are on is "v5r4m0". I'd be appreciated, as I've been spinning my wheels in this for a few weeks now and though Levenshtein was very nice, it was slower and didn't work fast enough for long titles (1-15 words).

    Thanks!

  2. #2
    Join Date
    Oct 2009
    Location
    221B Baker St.
    Posts
    486
    Which platform (hardware/operating system/db2 release) is being used?

  3. #3
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    I might be not fully understand your requirements.
    For example, I couldn't understand well this sentence...
    and then do an exclude on a temp table with my soundexed search term, sorting by the least number of leftover rows (row count on the exclude) ascending.
    Some examples would be much help to understand your requirements.

    By the way,
    how about to take two steps, like I wrote before?
    Quote Originally Posted by tonkuma View Post
    Another approach might be
    (1) Extract each words separated by blanks from strings.
    (2) Evaluate number of similar words.

    For step (1), Example 2: extract_elements UDF in this thread http://www.dbforums.com/db2/1677358-...sible-sql.html
    might be usable.

    For step (2), Some built-in functions(like SOUNDEX, DIFFERENCE) or UDFs(like Example 1, 2 in this thread) might be usable.

    Another my concern for step (2) is which combination of words should be evaluated?
    For example: should lack of words or exchange of words sequence be considered or not?

  4. #4
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Here is a very simple example.

    Although tested on DB2 9.7 for LUW, I tried to meet older DB2 for i Series.

    (1) Extract each words separated by blanks from strings.

    (1-1) Make a table function.
    Code:
    ------------------------------ Commands Entered ------------------------------
    CREATE FUNCTION split_words( in_string VARCHAR(255) )
    RETURNS
     TABLE( n SMALLINT , word VARCHAR(50) )
    LANGUAGE SQL
    DETERMINISTIC
    NO EXTERNAL ACTION
    READS SQL DATA
    /************************************************************
    ********** split words in input string by blank(s) **********
    ************************************************************/
    RETURN
    WITH rcte(n , word , pos , norm_string) AS (
    SELECT 0
         , CAST(null AS VARCHAR(50) )
         , 0
         , REPLACE(
              REPLACE(
                 REPLACE(
                    TRIM(in_string) , ' ' , '<>'
                 )
               , '><' , ''
              )
            , '<>' , ' '
           ) || ' ' 
     FROM  sysibm.sysdummy1
    UNION ALL
    SELECT n + 1
         , SUBSTR(  norm_string
                  , pos + 1
                  , NULLIF( LOCATE(' ' , norm_string , pos + 1) , 0 )
                    - pos - 1
                 )
         , LOCATE(' ' , norm_string , pos + 1)
         , norm_string
     FROM  rcte
     WHERE n < 50
       AND
      (    n   =  0
       OR  pos <> 0
      )
    )
    SELECT n , word
     FROM  rcte
     WHERE word IS NOT null
    ;
    ------------------------------------------------------------------------------
    DB20000I  The SQL command completed successfully.
    (1-2) Sample usage of the function.
    Code:
    ------------------------------ Commands Entered ------------------------------
    WITH
      test_data(id , title) AS (
    SELECT 1 , 'Heater'                    FROM sysibm.sysdummy1 UNION ALL
    SELECT 2 , 'Heating '                  FROM sysibm.sysdummy1 UNION ALL
    SELECT 3 , ' My  Own    Gas  Lamp '    FROM sysibm.sysdummy1 UNION ALL
    SELECT 4 , 'My Heavy Health   Meter'   FROM sysibm.sysdummy1 UNION ALL
    SELECT 5 , 'Eating Heated Horseradish' FROM sysibm.sysdummy1
    )
    SELECT id
         , title
         , n
         , word
         , DIFFERENCE(word , 'Heating') AS difference
     FROM  test_data
         , TABLE( split_words(title) ) AS tf
     ORDER BY
           id
         , n
    ;
    ------------------------------------------------------------------------------
    
    ID          TITLE                     N      WORD                                               DIFFERENCE 
    ----------- ------------------------- ------ -------------------------------------------------- -----------
              1 Heater                         1 Heater                                                       2
              2 Heating                        1 Heating                                                      4
              3  My  Own    Gas  Lamp          1 My                                                           0
              3  My  Own    Gas  Lamp          2 Own                                                          0
              3  My  Own    Gas  Lamp          3 Gas                                                          0
              3  My  Own    Gas  Lamp          4 Lamp                                                         0
              4 My Heavy Health   Meter        1 My                                                           0
              4 My Heavy Health   Meter        2 Heavy                                                        1
              4 My Heavy Health   Meter        3 Health                                                       1
              4 My Heavy Health   Meter        4 Meter                                                        1
              5 Eating Heated Horseradish      1 Eating                                                       3
              5 Eating Heated Horseradish      2 Heated                                                       2
              5 Eating Heated Horseradish      3 Horseradish                                                  1
    
      13 record(s) selected.

    (2) Evaluate each strings based on most matched word(s) in the string.
    Code:
    ------------------------------ Commands Entered ------------------------------
    WITH
      test_data(id , title) AS (
    SELECT 1 , 'Heater'                    FROM sysibm.sysdummy1 UNION ALL
    SELECT 2 , 'Heating '                  FROM sysibm.sysdummy1 UNION ALL
    SELECT 3 , ' My  Own    Gas  Lamp '    FROM sysibm.sysdummy1 UNION ALL
    SELECT 4 , 'My Heavy Health   Meter'   FROM sysibm.sysdummy1 UNION ALL
    SELECT 5 , 'Eating Heated Horseradish' FROM sysibm.sysdummy1
    )
    SELECT id
         , title
         , MAX( DIFFERENCE(word , 'Heating') ) AS score
     FROM  test_data
         , TABLE( split_words(title) ) AS tf
     GROUP BY
           id
         , title
     ORDER BY
           score DESC
    ;
    ------------------------------------------------------------------------------
    
    ID          TITLE                     SCORE      
    ----------- ------------------------- -----------
              2 Heating                             4
              5 Eating Heated Horseradish           3
              1 Heater                              2
              4 My Heavy Health   Meter             1
              3  My  Own    Gas  Lamp               0
    
      5 record(s) selected.
    Last edited by tonkuma; 05-25-12 at 01:12. Reason: Add TRIM for function.

  5. #5
    Join Date
    May 2012
    Posts
    13
    Thanks tonkuma, you're amazing. But if I understand correctly (I haven't tried finished creating the procedure yet), this would only evaluate single words inputs? Ideally, If I entered "ade frist", my first result back would be "first aid". Maybe I've misunderstood the above, but it looks like that would return where a single search term has matched the best among many words, not how many words of a multi-word search term have matched a multi-word title. The split_words function is actually something I was looking to write regardless, so thank you!

  6. #6
    Join Date
    May 2012
    Posts
    13
    Quote Originally Posted by tonkuma View Post
    I might be not fully understand your requirements.
    For example, I couldn't understand well this sentence...
    I guess I mean that my idea of how two sentences could be compared and rated was like this (except everything would be soundexxed first):

    Search term:
    "How to Ride a Bike"
    Existing title to compare with:
    "Ride a Bike Fast"

    -Remove where soundex's are the same (remove matches):
    Search term:
    "How to"
    Existing title to compare with:
    "Fast"

    -Count all remaining rows
    Leftovers = 3

    So if my search term was "How to Ride a Bike", one row I might get back might be:

    Course ID | Title | Leftovers
    -----------------------------------------------------
    1 | Ride a Bike Fast| 3

    And then I could sort by the ascending "leftovers" to find the closest sentences. That's just my first thought and I am young and inexperienced.

  7. #7
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    First of all,
    please supply more sample data.
    For example:
    Please publish at least 3 or more titles to be compared for one Search term: "How to Ride a Bike"
    Because, it is neccesary to debug SQL code.

    Without enough sample/test data, too easy or too simple SQL might produce expected result.
    Then if you applied the sample SQL to your real data, the result might be different from your expected.
    So, you would ask again.
    Those repetition waste not only your time but also my time.


    By the way,
    I couldn't understand why you want to count remaining rows.
    I thought that counting matched words would be better.
    If you think that counting remaining rows is better, please publish some sample data to certify it.


    Anyhow,
    here is an example
    (1) split words in both of Search_term and title
    (2) extract word-pair having DIFFERENCE(word , search_word) >= 3
    (3) summarize DIFFERENCE(word , search_word) and name it score
    (4) order by largest score first

    Example 1: Tested on DB2 9.7 for Windows.
    Code:
    ------------------------------ Commands Entered ------------------------------
    WITH
      test_data(id , title) AS (
    SELECT 1 , 'Heater'                    FROM sysibm.sysdummy1 UNION ALL
    SELECT 2 , 'Heating '                  FROM sysibm.sysdummy1 UNION ALL
    SELECT 3 , ' My  Own    Gas  Lamp '    FROM sysibm.sysdummy1 UNION ALL
    SELECT 4 , 'My Heavy Health   Meter'   FROM sysibm.sysdummy1 UNION ALL
    SELECT 5 , 'Eating Heated Horseradish' FROM sysibm.sysdummy1 UNION ALL
    SELECT 6 , 'Ride a Bike Fast'          FROM sysibm.sysdummy1 UNION ALL
    SELECT 7 , 'Fast'                      FROM sysibm.sysdummy1
    )
    , parameter(search_term) AS (
    SELECT 'Heating'                       FROM sysibm.sysdummy1 UNION ALL
    SELECT 'How to Ride a Bike'            FROM sysibm.sysdummy1
    )
    SELECT Search_term
         , id
         , title
         , DEC(
              100. * SUM( DIFFERENCE(word , search_word) )
              / ( MAX(max_m) * 4 )
              * COUNT(DISTINCT search_word) / COUNT(*)
            , 5 , 2
           ) AS score
     FROM  parameter
     CROSS JOIN
           LATERAL
          (SELECT n    AS m
                , word AS search_word
                , n - 1
                    + ROW_NUMBER()
                         OVER( ORDER BY n DESC ) AS max_m
            FROM TABLE( split_words(search_term) ) AS fp
          ) AS fq
     LEFT  OUTER JOIN
           test_data
     CROSS JOIN
           TABLE( split_words(title)       ) AS ft
      ON   DIFFERENCE(word , search_word) >= 3
     GROUP BY
           search_term
         , id
         , title
     ORDER BY
           search_term
         , score       DESC
    ;
    ------------------------------------------------------------------------------
    
    SEARCH_TERM        ID          TITLE                     SCORE  
    ------------------ ----------- ------------------------- -------
    Heating                      2 Heating                    100.00
    Heating                      5 Eating Heated Horseradish   75.00
    How to Ride a Bike           6 Ride a Bike Fast            90.00
    How to Ride a Bike           3  My  Own    Gas  Lamp       60.00
    How to Ride a Bike           4 My Heavy Health   Meter     45.00
    
      5 record(s) selected.

  8. #8
    Join Date
    May 2012
    Posts
    13
    Thanks so much, those results are exciting. However, maybe due to a differing version or my own ignorance to db2, I am having an issue. I've tried wrapping it in a function (still with your test data table inside) so that I can call a select on it and view the results. Everything before the "begin" I was told by a co-worker to put before custom functions. And "CHTDEV.FN_SPLIT" is the name of your split function that I had run.

    Code:
    CREATE FUNCTION CHTDEV.FN_SEARCH_COMPARE(  )
    RETURNS  TABLE( search_term varchar(50), id int, title varchar(50), score decimal)
    LANGUAGE SQL 
    	MODIFIES SQL DATA
    	SPECIFIC CHTDEV.FN_SEARCH_COMPARE
    	NOT DETERMINISTIC 
    	CALLED ON NULL INPUT 
    	DISALLOW PARALLEL 
    	NOT FENCED 
    	SET OPTION  ALWBLK = *ALLREAD , 
    	ALWCPYDTA = *OPTIMIZE , 
    	COMMIT = *NONE , 
    	DECRESULT = (31, 31, 00) , 
    	DFTRDBCOL = *NONE , 
    	DYNDFTCOL = *NO , 
    	DYNUSRPRF = *USER , 
    	SRTSEQ = *HEX 
    BEGIN
    RETURN
    
    --function here
    WITH
     --make it look through the course master table below
      test_data(id , title) AS (
    SELECT 1 , 'Heater'                    FROM sysibm.sysdummy1 UNION ALL
    SELECT 2 , 'Heating '                  FROM sysibm.sysdummy1 UNION ALL
    SELECT 3 , ' My  Own    Gas  Lamp '    FROM sysibm.sysdummy1 UNION ALL
    SELECT 4 , 'My Heavy Health   Meter'   FROM sysibm.sysdummy1 UNION ALL
    SELECT 5 , 'Eating Heated Horseradish' FROM sysibm.sysdummy1 UNION ALL
    SELECT 6 , 'Ride a Bike Fast'          FROM sysibm.sysdummy1 UNION ALL
    SELECT 7 , 'Fast'                      FROM sysibm.sysdummy1
    )
    , parameter(search_term) AS (
    SELECT 'Heating'                       FROM sysibm.sysdummy1 UNION ALL
    SELECT 'How to Ride a Bike'            FROM sysibm.sysdummy1
    )
    SELECT search_term
         , id
         , title
         , DEC(
              100. * SUM( DIFFERENCE(word , search_word) )
              / ( MAX(max_m) * 4 )
              * COUNT(DISTINCT search_word) / COUNT(*)
            , 5 , 2
           ) AS score
     FROM  parameter
     CROSS JOIN
           LATERAL
          (SELECT n    AS m
                , word AS search_word
                , n - 1
                    + ROW_NUMBER()
                         OVER( ORDER BY n DESC ) AS max_m
            FROM TABLE( CHTDEV.FN_SPLIT(search_term) ) AS fp
          ) AS fq
     LEFT  OUTER JOIN
           test_data
     CROSS JOIN
           TABLE( CHTDEV.FN_SPLIT(title)       ) AS ft
      ON   DIFFERENCE(word , search_word) >= 3
    
    
     GROUP BY
           search_term
         , id
         , title
    
     ORDER BY
          search_term
        , score       DESC
    ;
    
    END;


    and get the following error:
    Code:
    SQL State: 42601
    Vendor Code: -199
    Message: [SQL0199] Keyword ORDER not expected. Valid tokens: ;. Cause . . . . . :   The keyword ORDER was not expected here.  A syntax error was detected at keyword ORDER.  The partial list of valid tokens is ;. This list assumes that the statement is correct up to the unexpected keyword.  The error may be earlier in the statement but the syntax of the statement seems to be valid up to this point. Recovery  . . . :   Examine the SQL statement in the area of the specified keyword.  A colon or SQL delimiter may be missing. SQL requires reserved words to be delimited when they are used as a name. Correct the SQL statement and try the request again.
    Any idea what this could be from?

    Thanks a bunch

  9. #9
    Join Date
    Mar 2003
    Posts
    280
    Quote Originally Posted by trobinson1 View Post
    Thanks so much, those results are exciting. However, maybe due to a differing version or my own ignorance to db2, I am having an issue. I've tried wrapping it in a function (still with your test data table inside) so that I can call a select on it and view the results. Everything before the "begin" I was told by a co-worker to put before custom functions. And "CHTDEV.FN_SPLIT" is the name of your split function that I had run.

    Code:
    CREATE FUNCTION CHTDEV.FN_SEARCH_COMPARE(  )
    RETURNS  TABLE( search_term varchar(50), id int, title varchar(50), score decimal)
    LANGUAGE SQL 
    	MODIFIES SQL DATA
    	SPECIFIC CHTDEV.FN_SEARCH_COMPARE
    	NOT DETERMINISTIC 
    	CALLED ON NULL INPUT 
    	DISALLOW PARALLEL 
    	NOT FENCED 
    	SET OPTION  ALWBLK = *ALLREAD , 
    	ALWCPYDTA = *OPTIMIZE , 
    	COMMIT = *NONE , 
    	DECRESULT = (31, 31, 00) , 
    	DFTRDBCOL = *NONE , 
    	DYNDFTCOL = *NO , 
    	DYNUSRPRF = *USER , 
    	SRTSEQ = *HEX 
    BEGIN
    RETURN
    
    --function here
    WITH
     --make it look through the course master table below
      test_data(id , title) AS (
    SELECT 1 , 'Heater'                    FROM sysibm.sysdummy1 UNION ALL
    SELECT 2 , 'Heating '                  FROM sysibm.sysdummy1 UNION ALL
    SELECT 3 , ' My  Own    Gas  Lamp '    FROM sysibm.sysdummy1 UNION ALL
    SELECT 4 , 'My Heavy Health   Meter'   FROM sysibm.sysdummy1 UNION ALL
    SELECT 5 , 'Eating Heated Horseradish' FROM sysibm.sysdummy1 UNION ALL
    SELECT 6 , 'Ride a Bike Fast'          FROM sysibm.sysdummy1 UNION ALL
    SELECT 7 , 'Fast'                      FROM sysibm.sysdummy1
    )
    , parameter(search_term) AS (
    SELECT 'Heating'                       FROM sysibm.sysdummy1 UNION ALL
    SELECT 'How to Ride a Bike'            FROM sysibm.sysdummy1
    )
    SELECT search_term
         , id
         , title
         , DEC(
              100. * SUM( DIFFERENCE(word , search_word) )
              / ( MAX(max_m) * 4 )
              * COUNT(DISTINCT search_word) / COUNT(*)
            , 5 , 2
           ) AS score
     FROM  parameter
     CROSS JOIN
           LATERAL
          (SELECT n    AS m
                , word AS search_word
                , n - 1
                    + ROW_NUMBER()
                         OVER( ORDER BY n DESC ) AS max_m
            FROM TABLE( CHTDEV.FN_SPLIT(search_term) ) AS fp
          ) AS fq
     LEFT  OUTER JOIN
           test_data
     CROSS JOIN
           TABLE( CHTDEV.FN_SPLIT(title)       ) AS ft
      ON   DIFFERENCE(word , search_word) >= 3
    
    
     GROUP BY
           search_term
         , id
         , title
    
     ORDER BY
          search_term
        , score       DESC
    ;
    
    END;


    and get the following error:
    Code:
    SQL State: 42601
    Vendor Code: -199
    Message: [SQL0199] Keyword ORDER not expected. Valid tokens: ;. Cause . . . . . :   The keyword ORDER was not expected here.  A syntax error was detected at keyword ORDER.  The partial list of valid tokens is ;. This list assumes that the statement is correct up to the unexpected keyword.  The error may be earlier in the statement but the syntax of the statement seems to be valid up to this point. Recovery  . . . :   Examine the SQL statement in the area of the specified keyword.  A colon or SQL delimiter may be missing. SQL requires reserved words to be delimited when they are used as a name. Correct the SQL statement and try the request again.
    Any idea what this could be from?

    Thanks a bunch
    The function returns a table which by definition is unordered, i.e. you cant order the result. Let the caller of the function do the ordering. Another option is to add the logic to a procedure and return a cursor

  10. #10
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    I don't know the reason to make the function.
    Did you executed my example directly?

    "WITH test_data(...) AS (...) , parameter(...) AS (...) SELECT ..." is a select-statement.
    So, usually it returns some results for you.

    If your client doesn't recognize the syntax or not return results,
    please try to make two tables(test_data and parameter) by CREATE TABLE statements, INSERT data and execute SELECT part only.


    Anyway,
    if you want make your function, remove ORDER BY clause and specify ORDER BY clause in the query which uses the function.

    Another issues is...
    it may be better to remove begin and end surrounding RETURN and my example, even if it worked.
    This function-body have only a return statement. So, begin and end are useless.

  11. #11
    Join Date
    May 2012
    Posts
    13
    Quote Originally Posted by tonkuma View Post
    I don't know the reason to make the function.
    Did you executed my example directly?

    "WITH test_data(...) AS (...) , parameter(...) AS (...) SELECT ..." is a select-statement.
    So, usually it returns some results for you.

    If your client doesn't recognize the syntax or not return results,
    please try to make two tables(test_data and parameter) by CREATE TABLE statements, INSERT data and execute SELECT part only.


    Anyway,
    if you want make your function, remove ORDER BY clause and specify ORDER BY clause in the query which uses the function.

    Another issues is...
    it may be better to remove begin and end surrounding RETURN and my example, even if it worked.
    This function-body have only a return statement. So, begin and end are useless.
    Thanks, I didn't know that

    I've removed the begin/end and moved the ORDER BY to the caller, and still receive:

    Code:
    SQL State: 58004
    Vendor Code: -443
    Message: [SQL0443] CPD4350 error Cause . . . . . :   Either a trigger program, external procedure, or external function detected and returned an error to SQL. If the error occurred in a trigger program, the trigger was on table FN_SE00001 in schema CHTDEV. If the error occurred in an external procedure or function, the external name is FN_SE00001 in schema CHTDEV.  The associated text is CPD4350
    Does that error sound familiar to you or is this something specific on our system I should be asking the admin about?

    Thanks, and I apologize for my ignorance about DB2.

  12. #12
    Join Date
    May 2012
    Posts
    13
    And as another question - how feasible might this searching be? After playing with your split function a while, it seems to get down to 3ms per execution. But 3* around 3,000 records or more in the db is going to be at least 9 seconds on splitting alone. What are you thoughts on any way to speed something this large up? Have you dealt with phoenetic searhcing like this before on a bigger scale?

    Thanks for your help

    Tor

  13. #13
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Quote Originally Posted by trobinson1 View Post
    Thanks, I didn't know that

    I've removed the begin/end and moved the ORDER BY to the caller, and still receive:

    Code:
    SQL State: 58004
    Vendor Code: -443
    Message: [SQL0443] CPD4350 error Cause . . . . . :   Either a trigger program, external procedure, or external function detected and returned an error to SQL. If the error occurred in a trigger program, the trigger was on table FN_SE00001 in schema CHTDEV. If the error occurred in an external procedure or function, the external name is FN_SE00001 in schema CHTDEV.  The associated text is CPD4350
    Does that error sound familiar to you or is this something specific on our system I should be asking the admin about?

    Thanks, and I apologize for my ignorance about DB2.
    You wrote "... removed the begin/end and moved the ORDER BY to the caller, ..."
    So, if not removed the begin/end and only moved the ORDER BY to the caller,
    what result did you got?


    ...
    ...
    Message: [SQL0443] CPD4350 error Cause . . . . . :
    The strangeness for me is the error message doesn't mention to SQL function nor SQL query itself.
    "... CPD4350 error ..." seems not the SQL error.


    ... If the error occurred in a trigger program, the trigger was on table FN_SE00001 in schema CHTDEV. The associated text is CPD4350
    So, I want to recommend you to see your "CREATE TABLE CHTDEV.FN_SE00001 ..." or "CREATE TRIGGER ... " on table CHTDEV.FN_SE00001.
    Or, copy and paste the CREATE FUNCTION statement of your function.


    Anyway,
    I doubt that the error message was related directly to my sample function.
    Did you tried my function and query?
    I doubt that.
    because you wrote that you made another function "CHTDEV.FN_SEARCH_COMPARE" which include my sample code.
    Something a little difference(for human being) might be a fatal difference in programming world.

    I want you to try my example first.
    Though, you are free to do anything you want.
    But, I don't want to help for your own(and apart from my example/recommendation) trial.
    Last edited by tonkuma; 05-28-12 at 18:33.

  14. #14
    Join Date
    May 2012
    Posts
    13
    Quote Originally Posted by tonkuma View Post
    You wrote "... removed the begin/end and moved the ORDER BY to the caller, ..."
    So, if not removed the begin/end and only moved the ORDER BY to the caller,
    what result did you got?



    The strangeness for me is the error message doesn't mention to SQL function nor SQL query itself.
    "... CPD4350 error ..." seems not the SQL error.



    So, I want to recommend you to see your "CREATE TABLE CHTDEV.FN_SE00001 ..." or "CREATE TRIGGER ... " on table CHTDEV.FN_SE00001.
    Or, copy and paste the CREATE FUNCTION statement of your function.


    Anyway,
    I doubt that the error message was related directly to my sample function.
    Did you tried my function and query?
    I doubt that.
    because you wrote that you made another function "CHTDEV.FN_SEARCH_COMPARE" which include my sample code.
    Something a little difference(for human being) might be a fatal difference in programming world.

    I want you to try my example first.
    Though, you are free to do anything you want.
    But, I don't want to help for your own(and apart from my example/recommendation) trial.
    Sorry for the late response, and I thank you for your help. Tests have shown this to be a bit too slow, and so I've thought of possibly having a seperate table that keeps pre-processed soundexed titles ("Some Thing" = "SXXX TXXX") that I will query from, I'll ignore DIFFERENCE because it's a bit too slow and would require a split, and do something like


    Code:
    select  [id column],  [title column], (
    		select count(word)
    		from TABLE(
    			CHTDEV.FN_SPLIT(lower('scaffold training'))
    			) as sentence 
    		where lower([title columns]) LIKE '%'||word||'%')
    	AS matches 
    FROM [my course table]  
    ORDER BY matches DESC, [id column] ASC
    FETCH FIRST 10 ROWS ONLY;
    I'll have to actually add soundexxing, but it seems quick enough to do whole word checks. After a soundex table is made that I'll query from, I'll be using "soundex(word)" rather than "word" or probably soundexxing the sentence table first and do it a maxiumum of [word count] times. I'll see how it goes.

    Thanks

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •