  1. #1
    Join Date
    Apr 2002
    Posts
    84

    Question Unanswered: index scan vs seq scan

    I have a large table, with more than 200k records and a "keyword"
    field of varchar(3000).

    I've built an index on this field, but my SQL is something like

    SELECT * FROM TABLE WHERE upper(keyword) LIKE '%CAR%'

    so it uses a seq scan instead of an index scan, taking about
    10 seconds to get the result. That's a bit unacceptable.

    Is there any way to improve the performance?

    thanks.

  2. #2
    Join Date
    Sep 2003
    Location
    NorthWest
    Posts
    10
    I have this problem too, with a "description" field of varchar(255). The search is also done with LIKE, and it takes some time.

    I also know this can be improved, but I don't know what type of indexing I should use.

    Can someone offer a suggestion? Thanks
    Last edited by FlashMagnuM; 09-19-03 at 04:36.
    The more people I know, the more I love my dog.

  3. #3
    Join Date
    Sep 2003
    Location
    Wisconsin, USA
    Posts
    34

    ummm

    AFAIK, there is no way to speed up a LIKE.

    Think of it this way ... what if it were a number? If you did LIKE '%10%', the database would have to match:
    10
    100
    210
    abc10ount
    500.10
    350861038563

    etc ....

    So it has to go through the data of every row and do a text search on each field! (This is also known as a SEQ SCAN.)

    There are three main ways to speed this up ...

    The main one: don't use text fields as your primary key. Use the serial type whenever humanly possible. If you are frequently searching for a text value, chances are you need to redesign your database.

    For example, if you have product codes with CAR in them, create a separate table called product_categories, with a category_id and a category_name. Then, in the products table, alongside your full product code (maybe '03CAR_MAZDA'), add a category_id column that references product_categories. Now you can "SELECT * FROM products WHERE category_id = category_id_for_CAR" (where category_id_for_CAR is a number, like 2, that references product_categories).

    Notice that we're now using "=" instead of "LIKE", which can utilize indexes and should be MUCH MUCH faster!
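    A minimal sketch of the redesign described above, for PostgreSQL. All table, column, and index names here are illustrative, not the poster's actual schema:

    ```sql
    -- hypothetical lookup table for categories
    CREATE TABLE product_categories (
        category_id   serial PRIMARY KEY,
        category_name varchar(50) NOT NULL
    );

    -- products reference a category by id instead of
    -- embedding 'CAR' inside a text code
    CREATE TABLE products (
        product_id   serial PRIMARY KEY,
        product_code varchar(20),   -- e.g. '03CAR_MAZDA'
        category_id  integer REFERENCES product_categories (category_id)
    );

    CREATE INDEX products_category_idx ON products (category_id);

    -- equality lookup that can use the index
    -- (2 stands in for the id of the 'CAR' category)
    SELECT * FROM products WHERE category_id = 2;
    ```

    The point is the last line: an equality comparison on an indexed integer column, rather than a LIKE over a long varchar.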

    The second is to distribute your data across multiple tables (known as normalizing). Try to keep a reasonable number of columns in each table. As a general rule of thumb, if you have more than 10 columns in a table, your data can probably be normalized further. (Don't flame me on this! It's a rule of thumb only.)

    Finally, if you can split your data into multiple tables (say, by putting old records in a different table, perhaps creating a new one every three months/weeks/days), the database doesn't have to SEQ SCAN as much data.
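    As a rough sketch of that idea (the table names, the date column, and the cutoff date are all made up for illustration), old rows can be moved into an archive table so that day-to-day queries scan a smaller table:

    ```sql
    -- hypothetical archive table with the same columns as the live table
    CREATE TABLE big_table_archive AS
        SELECT * FROM big_table WHERE false;

    -- move records older than a chosen cutoff into the archive
    INSERT INTO big_table_archive
        SELECT * FROM big_table WHERE record_date < '2003-06-01';
    DELETE FROM big_table WHERE record_date < '2003-06-01';
    ```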


    Hope this helps a little.

  4. #4
    Join Date
    Apr 2002
    Posts
    84

    algorithm...

    I have a table, prod, with the fields id, name, keyword and date.

    My program does something like this:

    1. use a search engine to search for the keyword; a vector
    of ids is returned.

    2. connect to the db and fetch the prod records one by one,
    given the vector of ids.

    3. sort the prod records by date and then by name.

    The most time-consuming parts are steps 2 and 3, which is why
    I am thinking of using SQL with a LIKE clause to replace them.
    However, it is still a bit slow because of the LIKE clause.

    Is there any solution for this?

    thanks.
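    One sketch of a fix for steps 2 and 3 above: pass the whole id vector from the search engine into a single query, so the records come back in one round trip and the database does the sorting. The id values here are illustrative:

    ```sql
    -- fetch all matching prod records at once and sort in the database
    SELECT id, name, keyword, date
    FROM prod
    WHERE id IN (17, 42, 103, 256)   -- the ids returned by the search engine
    ORDER BY date, name;
    ```

    An equality/IN lookup on the id column can use an index, so this avoids both the per-record round trips and the client-side sort.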

  5. #5
    Join Date
    Sep 2003
    Location
    NorthWest
    Posts
    10
    moku is right.

    I read a lot about SEQ SCAN over the last few days, and it seems it is not possible to speed this operation up, because it does not use indexes at all.
    The more people I know, the more I love my dog.
