  1. #1
    Join Date
    Dec 2004
    Posts
    3

    Unanswered: Help with Oracle Text

    Hi

    I'm using Oracle 9i. I created a table:

    CREATE TABLE my_docs (
    id NUMBER(10) NOT NULL,
    name VARCHAR2(200) NOT NULL,
    doc BLOB NOT NULL
    )
    /

    ALTER TABLE my_docs ADD (
    CONSTRAINT my_docs_pk PRIMARY KEY (id)
    )
    /

    Then I loaded some pdf, doc and html files using Java and JDBC.

    Then I created an index:
    CREATE INDEX my_docs_doc_idx ON my_docs(doc) INDEXTYPE IS CTXSYS.CONTEXT;
    EXEC DBMS_STATS.GATHER_TABLE_STATS('oratext', 'my_docs', cascade=>TRUE);

    Then I tried to find the files that contain the word 'car' with this query:
    select id, name from my_docs where contains(doc, 'car')>0;
    or
    select id, name from my_docs where contains(doc, '%car%')>0;

    Both return:
    "no rows selected"

    The table my_docs has a row holding a PDF document that contains the word "car". So why do I get "no rows selected"?


    Marek

  2. #2
    Join Date
    Aug 2004
    Location
    France
    Posts
    754
    Hello MarekNow,

    I'm not sure, but I think I know where your problem comes from. When you create a domain index, there is a FILTER parameter, which by default is NULL (corresponding to plain text). To index formatted documents (such as .doc or .pdf), AFAIK you must specify CTXSYS.INSO_FILTER for this parameter, like this:
    Code:
    CREATE INDEX my_docs_doc_idx ON my_docs(doc) 
    INDEXTYPE IS CTXSYS.CONTEXT 
    parameters ('filter ctxsys.inso_filter');
    If you want more info, please check the manual, part 2 "Indexing", section "Filter Types".

    I've never tried this parameter because I've never needed it, but I think it should work in your situation.

    HTH & Regards,

    RBARAER
    Last edited by RBARAER; 12-03-04 at 08:51.
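
    A minimal sketch for checking whether the documents actually made it into the index, assuming the index name my_docs_doc_idx from this thread and a session connected as the index owner. CTX_USER_INDEX_ERRORS lists documents that failed during indexing, and the DR$...$I table holds the indexed tokens; if it is empty, nothing was extracted from the BLOBs.

    ```sql
    -- Rows here mean some documents could not be filtered or indexed.
    -- err_textkey is the ROWID of the failing row in my_docs.
    SELECT err_timestamp, err_textkey, err_text
    FROM ctx_user_index_errors
    WHERE err_index_name = 'MY_DOCS_DOC_IDX'
    ORDER BY err_timestamp DESC;

    -- Number of token entries stored for the index; 0 means no text was indexed.
    SELECT COUNT(*) AS token_count
    FROM dr$my_docs_doc_idx$i;
    ```

    If the error view is empty and the token table has rows, a query such as contains(doc, 'car') > 0 should normally be able to match.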

  3. #3
    Join Date
    Dec 2004
    Posts
    3

    Oracle Text

    Hi

    Thank you for your reply. The problem was that the database I was using did not have Oracle Text installed. I was able to create the index on my BLOB column (containing PDF documents), but that index wasn't really usable.

    Marek
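
    For anyone hitting the same thing, a rough way to check whether Oracle Text is installed, assuming Oracle 9.2 or later (where DBA_REGISTRY is available) and an account that can read the DBA views:

    ```sql
    -- Oracle Text is registered under the component id 'CONTEXT'.
    -- No row, or a status other than VALID, suggests it is missing or broken.
    SELECT comp_id, comp_name, version, status
    FROM dba_registry
    WHERE comp_id = 'CONTEXT';

    -- The CTXSYS schema owns the Oracle Text objects; check that it exists.
    SELECT username, account_status
    FROM dba_users
    WHERE username = 'CTXSYS';
    ```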
