Results 1 to 8 of 8
  1. #1
    Join Date
    Aug 2003
    Posts
    111

    Unanswered: Query Constructed as a String

    Hi all

    I have the following store procedure that finds duplicates based on 3 conditions, names (there maybe multiple names), initiation_names (something that yoga people uses), and proofs of identity. the procedure will return a refcursor which returns a table of 5 columns, these columns are weighted and which includes id of the duplicate found ,the type of duplicate found as well as a similarity index showing how simiar the duplicates are.

    The procedure works fine. But we wanted more out of the procedure. Whenever a new identity(person) is entered into the database, a name or several names of this person is entered into the database. Our procedure is suppose to find out whether there is already a record of this person in the database, now because of spelling mistakes, these names duplicates may never be found based on names. For example, if the person's name is "james mcdonald" and if it's entered as "james mc donald" or "james mc'donald", the current procedure has no way of telling this because the checking is done as follows

    SELECT ....
    WHERE
    NAMES.FIRST_NAME IN (first_name_in)
    AND NAMES.LAST_NAME IN (last_name_in) ...

    There are 2 things worth mentioning, because i am working with a list of names rather than a single name, I had to use "IN", this also means the query needs to be constructed as a string before it is executed using

    open duplicates_out for the_sql_query_string;

    Does any one have any suggestion of how duplicate checking maybe done in such a way that it will be able to find duplicates even if there are minor spelling mistakes. we have tried soundex in oracle but that didn't work too well. The only other simple way that i can think of which doesn't cover spelling mistake in general checks for whether there are extra spaces or single quotes used. what i need to do essentially is to change the following

    SELECT ....
    WHERE
    NAMES.FIRST_NAME IN (first_name_in)
    AND NAMES.LAST_NAME IN (last_name_in) ...

    to something like

    SELECT ....
    WHERE
    replace(replace(NAMES.FIRSTNAME, ''', ''),' ', '') IN (first_name_in)
    AND replace(replace(NAMES.LAST_NAME, ''', ''),' ', '') IN (last_name_in) ...

    remember however because my query needs to be constructed as a string, when i have " replace(replace(NAMES.FIRSTNAME, ''', ''),' ', '') " which has single quotes, it breaks the string and the procedure won't compile. So the question is really, how do i construct a string that has single quotes in the string itself.

    Cheers
    James

    If you need to have a look at how i constructed my query string, it's as follows. The following procedure works.

    procedure find_duplicates(first_name_in in varchar2, last_name_in in varchar2, initiation_name_in in varchar2,
    proof_type_id_in in varchar2, proof_number_in in varchar2, duplicates_out out refcursor)
    as
    sqlcommand varchar2(4000);
    begin
    sqlcommand :=
    'select ID, sum (common_name) IsCNDuplicate, sum(initiation_name) IsINDuplicate, sum(proof) IsPRDuplicate,
    sum(common_name) + sum(initiation_name) + sum(proof) SimilarityIdx
    from
    (
    select distinct(IDENTITIES.ID), 1 common_name, 0 initiation_name, 0 proof
    from IDENTITIES, NAMES
    where NAMES.IDENTITIES_ID = IDENTITIES.ID
    AND NAMES.FIRST_NAME IN (' || first_name_in || ')
    AND NAMES.LAST_NAME IN (' || last_name_in || ')
    union
    select distinct(IDENTITIES.ID), 0 common_name, 2 initiation_name, 0 proof
    from IDENTITIES, INITIATIONS
    where INITIATIONS.IDENTITIES_ID = IDENTITIES.ID
    AND INITIATIONS.NAME IN (' || initiation_name_in || ')
    union
    select distinct(IDENTITIES.ID), 0 common_name, 0 initiation_name, 4 proof
    from IDENTITIES, ID_PROOF
    where ID_PROOF.IDENTITIES_ID = IDENTITIES.ID
    AND ID_PROOF.VERIF_NUMBER IN (' || proof_number_in || ')
    AND ID_PROOF.PROOF_TYPE_ID IN (' || proof_type_id_in || ')
    )
    group by ID
    order by SimilarityIdx desc';
    open duplicates_out for sqlcommand;
    end find_duplicates;

  2. #2
    Join Date
    Aug 2003
    Posts
    111
    I found an answer to part of my question

    Q: how to put single quotes into a string in oracle

    A: use 2 consecutive single quotes, e.g. 'abc''efg' will be interpretted as abc'efg


    I still want to know if there is a way to check for spelling mistakes using pl/sql

  3. #3
    Join Date
    Aug 2003
    Posts
    111
    So now i can put single quote in my query string, but when i have the following query, note that i need to replace single quote ' with nothing, so in my replace command,

    i have
    replace(NAMES.FIRST_NAME,'''''')

    which should be interpretted as
    replace(NAMES.FIRST_NAME,''')

    but it doesn't seem to be working and is returning me an missing parenthesis error.

    my new store procedure that compiles but which throws an exception upon execution


    sqlcommand :=
    'select ID, sum (common_name) IsCNDuplicate, sum(initiation_name) IsINDuplicate, sum(proof) IsPRDuplicate,
    sum(common_name) + sum(initiation_name) + sum(proof) SimilarityIdx
    from
    (
    select distinct(IDENTITIES.ID), 1 common_name, 0 initiation_name, 0 proof
    from IDENTITIES, NAMES
    where NAMES.IDENTITIES_ID = IDENTITIES.ID
    AND replace(NAMES.FIRST_NAME,'''''') IN (' || first_name_in || ')
    AND replace(NAMES.LAST_NAME,'''''') IN (' || last_name_in || ')
    union
    select distinct(IDENTITIES.ID), 0 common_name, 2 initiation_name, 0 proof
    from IDENTITIES, INITIATIONS
    where INITIATIONS.IDENTITIES_ID = IDENTITIES.ID
    AND INITIATIONS.NAME IN (' || initiation_name_in || ')
    union
    select distinct(IDENTITIES.ID), 0 common_name, 0 initiation_name, 4 proof
    from IDENTITIES, ID_PROOF
    where ID_PROOF.IDENTITIES_ID = IDENTITIES.ID
    AND ID_PROOF.VERIF_NUMBER IN (' || proof_number_in || ')
    AND ID_PROOF.PROOF_TYPE_ID IN (' || proof_type_id_in || ')
    )

  4. #4
    Join Date
    Oct 2003
    Location
    Pune
    Posts
    59
    Ur arguments for repalce are not right. Check that out
    You r passing two arguments
    try this out
    SELECT REPLACE('Cyrus','y',' ') FROM dual;

    cyrus

  5. #5
    Join Date
    Aug 2003
    Posts
    111
    you can do pass 2 arguments instead of 3, the third argument is optional, not passing an arguement simply tells the replace function to replace it with nothing ('').


  6. #6
    Join Date
    Aug 2003
    Posts
    111
    but i have to thank you cyrus
    you provided me a way of testing the replace function in sqlplus which i had to test through store procedure before, thank you.

    btw, i found my problem.

    essentially wat i needed to do was replacing single quote with nothing
    this being the first time i do this in oracle, i always thought i would have something like replace (names, '''), but the correct way of diong this is replace ( names, ''''), note the extra quote.

    now because i constructed my query using a string first, i needed to replace every single quote with 2 quotes. this means a total of 8 quotes consecutively, i had 6 quotes before that's why it failed.

    cheers
    thanx again cyrus

    that "select replace('cyrus', ' ', '') from dual" is a good general way to test pl/sql through sqlplus.

    James

  7. #7
    Join Date
    Aug 2003
    Posts
    111
    Hi all

    I would still like to know how spelling mistakes could be checked.

    Cheers
    James

  8. #8
    Join Date
    Nov 2002
    Location
    Desk, slightly south of keyboard
    Posts
    697
    Hi James,

    I think now that you can take out ' from your entered values, you should also be able to take out all punctuation, spaces etc. Essentially your input string should contain only chars in the range [A..Z]. Once you have achieved that, Soundex might be more reliable.

    Hth
    Bill

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •