Results 1 to 8 of 8
  1. #1
    Join Date
    Mar 2008
    Posts
    7

    Unanswered: Text Indexing Help

    I am wondering about text indexing as a solution to a problem I have. Basically I have 28 days of data, partitioned into day tables.

    Each day has approx 450,000,000 rows (so thats 12.6 billion for all 28 days) and I have a varchar2 column in each row which is a comma delimited string of ids (sorted), like this for example:

    a;b;c;
    a;b;
    b;c;
    c;d;e;

    I want the end user to be able to provide IDs for searching (via front end web page) and to then build out a dynamic sql to fetch the count of rows containing give ID combinations.

    So if they ask for id 'a' it would return 2, ids 'a' and 'b' = 3, etc. Without indexing I can just do

    SELECT count(*) from TABLE where column like '%a%';
    SELECT count(*) from TABLE where column like '%a%b%';

    My guess is this going to be ultra slow and I wonder is text indexing will help and how to implement it properly? Bear in mind I have a few thousand ids in the system, so would I have to create a few thousand indexes? (and is that even feasible?).

    Failing indexes, any other ideas anybody? I was figuring as an alternative to just possibly create a new table and extract the IDs into it and then query by joining the two tables on a common row id value....

    Thanks in advance.

  2. #2
    Join Date
    Aug 2003
    Location
    Where the Surf Meets the Turf @Del Mar, CA
    Posts
    7,776
    Provided Answers: 1
    >SELECT count(*) from TABLE where column like '%
    I suspect the SQL similar to the 1 above will always do a FTS on TABLE, because leading wildcard precludes use of index.
    You can lead some folks to knowledge, but you can not make them think.
    The average person thinks he's above average!
    For most folks, they don't know, what they don't know.
    Good judgement comes from experience. Experience comes from bad judgement.

  3. #3
    Join Date
    Mar 2008
    Posts
    7
    Right. Is that the case with both regex_like indexes and also ctxcat indexes?

    If so, what alternative could one do?

  4. #4
    Join Date
    Aug 2003
    Location
    Where the Surf Meets the Turf @Del Mar, CA
    Posts
    7,776
    Provided Answers: 1
    >If so, what alternative could one do?
    You are asking, "how to make a pig fly?".
    Good Luck!
    You can lead some folks to knowledge, but you can not make them think.
    The average person thinks he's above average!
    For most folks, they don't know, what they don't know.
    Good judgement comes from experience. Experience comes from bad judgement.

  5. #5
    Join Date
    Mar 2008
    Posts
    7
    What kind of comment is that? I guess companies like Google make pigs fly then huh?

  6. #6
    Join Date
    Aug 2003
    Location
    Where the Surf Meets the Turf @Del Mar, CA
    Posts
    7,776
    Provided Answers: 1
    >I guess companies like Google make pigs fly then huh?
    You are making 2 unwarranted leaps in logic.
    1) that Google use SQL
    2) Even if SQL is used, that it contains '%ID10T'

    One needs to be smarter than the equipement he is (ab)using.
    You can lead some folks to knowledge, but you can not make them think.
    The average person thinks he's above average!
    For most folks, they don't know, what they don't know.
    Good judgement comes from experience. Experience comes from bad judgement.

  7. #7
    Join Date
    Mar 2008
    Posts
    7
    Not really. I am simply assuming that Oracle is an enterprise level database capable of indexing large amounts of text data. As somebody without a large amount experience in this domain (unlike you I wasn't born with this knowledge), I am also assuming this forum contains some knowledgeable engineers with real world experience in such a problem who might be able to simply offer some pertinent advice...

    ...but somehow there registration page let you into the system. It seems all that you actually know is how to be rude and how to solve a problem by saying it can't be done instead of actually thinking about how it *can* be done.

    I seriously do hope you don't do this stuff for living, I feel sorry for your boss and co-workers if you do.

    If you were actually smarter than the equipment you are using (as you suggest), then you might be able to tell less experienced users how to use it too, but I guess you really don't know how to use it all, hence why you troll around here all day instead.

  8. #8
    Join Date
    Jun 2004
    Location
    Liverpool, NY USA
    Posts
    2,509
    I suspect what they are saying is that instead of a varchar2 string containing flags, have a child table that would contain a link back to the parent and a single index row containing every flag. If you have 20 flags, you would have 20 rows.
    Bill
    You do not need a parachute to skydive. You only need a parachute to skydive twice.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •