Results 1 to 7 of 7
  1. #1
    Join Date
    Oct 2005
    Posts
    25

    Unanswered: Parse a 255 Char Text Field for Multiple Strings

    I have a table which stores the results of a software scan in a table which contains a 255 character field named path. On ocassion, the file search returns a hit from a location which is invalid, i.e., any path which contains the text string "Transfer" or "Old C". When I am ready to retrieve the data in my select statement, I want to skip those entries.

    I am a 15 year veteran of Progress 4GL, but, novice to SQL. I know that searching character fields for strings is cumbersome, but, this is the legacy code I inherited:

    and upper(s.path) not like '%OLD C%'
    and upper(s.path) not like '%OLD D%'
    and upper(s.path) not like '%TRANSFER%'

    That looks extremely inefficient to me, but, I'm new.

    I used to do a Progress 4GL statement like

    IF INDEX(ENTRY(UPPER("TRANSFER","OLD C","OLD D")),s.path) NE 0 THEN NEXT.

    Something is missing in my Progress code, but, that's not important. I need an efficient select search.

    Thanks,

  2. #2
    Join Date
    Dec 2003
    Location
    Oklahoma, USA
    Posts
    354
    You might try something like this:

    Code:
       and INSTR( upper(s.path), 'OLD C' ) = 0
       and INSTR( upper(s.path), 'OLD D' ) = 0
       and INSTR( upper(s.path), 'TRANSFER' ) = 0
    It runs much faster for us. The faster part (I believe) is that INSTR returns the first instance of the search string. Once it finds a match, a value is returned. The LIKE statement searches the entire string every time.

    In all of our tests, we have also found that the explain plan for this type of statement runs much better. Not really sure why, so if anyone would care to comment on this, I'd be thrilled to know more...
    JoeB
    save disk space, use smaller fonts

  3. #3
    Join Date
    Oct 2005
    Posts
    25
    So, looking at your sample, would I be correct in my logic that the following would run even a little faster:

    AND NOT ( INSTR(UPPER(s.path), 'TRANSFER') <> 0
    OR INSTR(UPPER(s.path), 'OLD C' ) <> 0
    OR INSTR(UPPER(s.path), 'OLD D' ) <> 0 )

    Because the entire statement will be FALSE if any one of teh inner statements evaluates as FALSE, therefore, if the string 'TRANSFER' is found in the path, there is no need to continue testing for 'OLD C' and 'OLD D'. Is that correct?

  4. #4
    Join Date
    Dec 2003
    Location
    Oklahoma, USA
    Posts
    354
    I know Oracle uses short-cut logic (if multiple AND conditions exist, no need to check the rest of the conditions if any of the previous ones fail, etc) for AND conditions. I'm not sure if the same is true for OR's. What I am saying is if there is a WHERE clause that evaluates to "NOT (TRUE OR FALSE OR FALSE OR FALSE )" Oracle might check all conditions before determining a true or false condition. It would be interesting to test...

    One other thing to try is depending on the data, you will likely get better results depending on which strings are most common. For instance, if 'TRANSFER' occurs more often, put the INSTR( upper(s.path) ) = 0 first in the WHERE clause.
    JoeB
    save disk space, use smaller fonts

  5. #5
    Join Date
    Dec 2003
    Location
    Oklahoma, USA
    Posts
    354
    OK... learned something new... I think.

    I tried the following two queries on my own system (the HISTORY table has a list of web site hits according to browser agent), with my data and got the following results:

    Code:
    SELECT *
      FROM history
     WHERE NOT (INSTR (browser, 'MSIE 6.0') <> 0
         OR INSTR (browser, 'MSIE 5.5') <> 0)
    Code:
    SELECT *
      FROM history
     WHERE INSTR (browser, 'MSIE 6.0') = 0
         OR INSTR (browser, 'MSIE 5.5') = 0
    Same result set. However, first query, styled after your second question, has an explain plan that queries only 103 rows, cost of 1405. The second query has to scan 20K rows, but the same cost of 1405. Looks like enclosing the conditions inside another logical operator (NOT) enforces the short-cut logic and therefore makes the where clause process faster...
    JoeB
    save disk space, use smaller fonts

  6. #6
    Join Date
    Oct 2005
    Posts
    25

    Thanks for your help

    I can see the command processes much faster (just because i am sitting here waiting and watching). In the grand scheme of things, however, I think my big problem is volume. I have to search one table which is 57 million records.

    I started another thread which expanded on this question, and was told to do and EXPLAIN_PLAN.

    Here is my problem. I've never played with Oracle or straight SQL. I'm a PROGRESS 4GL Programmer. I know how to do what I want in PROGRESS, but I can't translate it to SQL. Furthermore, I am writing in an editor called SQL Plus Worksheet which is not helpful at all. I don't have any tools. And I certainly don't understand some of the evaluation tools available, or even if I have them available. Wah!!!!! Sorry. I'm a two week old extreme novice.

    That is the crux of the issue. But this has been helpful. I need to find a pocket referrence.

    I am going to start another thread which is closely related to this one.

  7. #7
    Join Date
    Dec 2003
    Location
    Oklahoma, USA
    Posts
    354
    A tool that could help is called TOAD... It is a procedure editor (any many other features such as DBA stuff) but it might help you as the explain plan is simply another tab to click on after you have submitted your query.

    Appreciate your newness... was there at one time too. However, you seem to be doing well and have a grasp of Oracle that many people have to wait years to attain to.

    Best of luck.
    JoeB
    save disk space, use smaller fonts

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •