Results 1 to 5 of 5
  1. #1
    Join Date
    Aug 2003
    Location
    Reston, VA
    Posts
    59

    Unanswered: Query Optimization

    Hello,

    I'm having a bit of difficulty running a 3 table JOIN query. It runs, but is awfully slow. I am using MySQL 4.0.20 running on a Quad processor, 2GB RAM, 1/2TB disk (1/4TB used) on RH Linux Advanced 3. The details are below....

    I have three tables. Objects, Collections_Objects and Collections. The relation is Objects -(one to many) -> Collections_objects <-(many to one)- Collections. I am using a FULLTEXT index on OBJECTS which is about 75 MB. Objects has ~ 145,000 records, Collections_Objects ~ 170,000 records and Collections ~ 2,500 records. All tables are MyISAM and all fields in my WHERE clause are indexed. Actually, I tried this with AND without and index on OBJECT_POSTING. That field has only four values in an enum...should this be indexed? Each term is under 15 charcaters long. Each table has been ANALYZED and check out okay.

    First, I do a Fulltext search on the Objects table alone. Works really good especially considering the size of the fulltext index. I searched a single term (non-boolean) in about .02 seconds and it returned 687 records.

    --query--
    SELECT OBJECTS.OBJECT_ID,
    MATCH ( OBJECTS.OBJECT_DESC, OBJECTS.OBJECT_ANNOTATION, OBJECTS.OBJECT_SEARCHABLE_TEXT, OBJECTS.OBJECT_TITLE )
    AGAINST ( 'mysearchterm' ) AS score
    FROM OBJECTS
    WHERE MATCH ( OBJECTS.OBJECT_DESC, OBJECTS.OBJECT_ANNOTATION, OBJECTS.OBJECT_SEARCHABLE_TEXT, OBJECTS.OBJECT_TITLE )
    AGAINST ( 'mysearchterm' ) AND ( OBJECTS.STATUS_ID = '2' || OBJECTS.STATUS_ID = '4' ) AND OBJECTS.OBJECT_POSTING = 'yes'
    --end query--
    --explain query--
    +---------+----------+---------------------------------------+--------------+---------+------+------+-------------+
    | table | type | possible_keys | key | key_len | ref | rows | Extra |
    +---------+----------+---------------------------------------+--------------+---------+------+------+-------------+
    | OBJECTS | fulltext | STATUS_ID,OBJECT_POSTING,OBJECT_TITLE | OBJECT_TITLE | 0 | | 1 | Using where |
    +---------+----------+---------------------------------------+--------------+---------+------+------+-------------+

    --end explain query--


    Now, I try to search only under one of my collections and it takes a whopping 39 seconds (give or take) to return 330 records!!!

    --QUERY--
    SELECT OBJECTS.OBJECT_ID,
    MATCH (
    OBJECTS.OBJECT_DESC, OBJECTS.OBJECT_ANNOTATION, OBJECTS.OBJECT_SEARCHABLE_TEXT, OBJECTS.OBJECT_TITLE
    )
    AGAINST (
    'mysearchterm'
    ) AS score
    FROM (
    OBJECTS
    INNER JOIN COLLECTIONS_OBJECTS ON OBJECTS.OBJECT_ID = COLLECTIONS_OBJECTS.OBJECT_ID
    )
    INNER JOIN COLLECTIONS ON COLLECTIONS_OBJECTS.COLLECTION_ID = COLLECTIONS.COLLECTION_ID
    WHERE MATCH (
    OBJECTS.OBJECT_DESC, OBJECTS.OBJECT_ANNOTATION, OBJECTS.OBJECT_SEARCHABLE_TEXT, OBJECTS.OBJECT_TITLE
    )
    AGAINST (
    'mysearchterm'
    ) AND (
    OBJECTS.STATUS_ID = '2' || OBJECTS.STATUS_ID = '4'
    ) AND OBJECTS.OBJECT_POSTING = 'yes' AND COLLECTIONS.COLLECTION_ID = '2426'
    --END Query --

    --Explain Query --
    +---------------------+----------+-----------------------------------------------+---------------+---------+-------+------+-------------+
    | table | type | possible_keys | key | key_len | ref | rows | Extra |
    +---------------------+----------+-----------------------------------------------+---------------+---------+-------+------+-------------+
    | COLLECTIONS | const | PRIMARY | PRIMARY | 4 | const | 1 | |
    | OBJECTS | fulltext | PRIMARY,STATUS_ID,OBJECT_POSTING,OBJECT_TITLE | OBJECT_TITLE | 0 | | 1 | Using where |
    | COLLECTIONS_OBJECTS | ref | OBJECT_ID,COLLECTION_ID | COLLECTION_ID | 4 | const | 12 | Using where |
    +---------------------+----------+-----------------------------------------------+---------------+---------+-------+------+-------------+
    --End Explain Query--



    Any help is much appreciated. I'm not really sure what I'm doing wrong here. Is it my SQL? Is it my indexing? Or, is this one of the 'complex queries' that MySQL doesn't run as fast as PostgreSQL (although it doesn't appear to be complex IMO) that the PostgreSQL advocates talk about. Oh, and bonus points if there is a way to do this recursively (Collections is a self-referencing table - i've only done this type of stuff through recursive code since sub-selects and tree structure support is not in 4.0.20).

  2. #2
    Join Date
    Aug 2004
    Location
    France
    Posts
    754
    Hello,

    Wouldn't this query be simpler ? I simply added the bold part to your first query.

    Code:
    SELECT OBJECTS.OBJECT_ID,
    MATCH ( OBJECTS.OBJECT_DESC, OBJECTS.OBJECT_ANNOTATION, OBJECTS.OBJECT_SEARCHABLE_TEXT, OBJECTS.OBJECT_TITLE )
    AGAINST ( 'mysearchterm' ) AS score
    FROM OBJECTS, COLLECTIONS_OBJECTS, COLLECTIONS
    WHERE 
    MATCH ( OBJECTS.OBJECT_DESC, OBJECTS.OBJECT_ANNOTATION, OBJECTS.OBJECT_SEARCHABLE_TEXT, OBJECTS.OBJECT_TITLE )
    AGAINST ( 'mysearchterm' ) 
    AND ( OBJECTS.STATUS_ID = '2' || OBJECTS.STATUS_ID = '4' ) 
    AND OBJECTS.OBJECT_POSTING = 'yes'
    AND  OBJECTS.OBJECT_ID = COLLECTIONS_OBJECTS.OBJECT_ID
    AND COLLECTIONS_OBJECTS.COLLECTION_ID = COLLECTIONS.COLLECTION_ID;
    I personally prefer, for Inner Joins, to use this syntax.

    Anyway, it is normal that performing two joins slows down your query. Having indexes on the columns on which the joins are done generally helps, but I think it should be the case here, as these columns look like primary keys.

    Tell me if this query performs better.

    Regards,

    RBARAER

  3. #3
    Join Date
    Aug 2003
    Location
    Reston, VA
    Posts
    59

    Thanks, but...

    .....40 seconds again! Funny, I usually right my standard INNER JOINS the same way, but figured I would try the INNER JOIN syntax to see if it sped things up.

    Anyway, what I actually did was simply take out the COLLECTIONS_OBJECTS.COLLECTION_ID = COLLECTIONS.COLLECTION_ID relation and changed COLLECTIONS.COLLECTION_ID='###' to COLLECTIONS_OBJECTS.COLLECTION_ID='###'....while taking out FROM COLLECTIONS......it worked MUCH quicker!!!! .03 seconds actually. Is this, however, bad query form to leave out the additional relation ship??? Is this a GOOD fix or a workaround? Well, seems to work so I guess I'll go with it for now.

    Follow up is still encouraged...

    Thanks again!

  4. #4
    Join Date
    Aug 2004
    Location
    France
    Posts
    754
    Well done !

    Of course you don't need the second join as you already have COLLECTION_ID in your COLLECTIONS_OBJECTS table... I think I should buy new glasses...

    This is a good query for what you are looking for. BTW, sorry, I just saw I forgot the criteria on COLLECTION_ID='###' in the query I gave you... I think I would have realized my error if I had written it (I hope so, at least, or maybe I don't need glasses, but a brain ). You will notice that writing joins as I did, and as you usually do, it seems, is much clearer than with INNER JOIN. At least, it is my opinion.

    You would only have needed the second join if you had needed a field only present in your COLLECTIONS table.

    Regards,

    RBARAER

  5. #5
    Join Date
    Aug 2003
    Location
    Reston, VA
    Posts
    59

    Thanks!

    Yes, there really is no reason to have the second join if nothing is pulling from that table. Yes, I also agree this seems to be a "cleaner" way to write queries unless you are using a LEFT or RIGHT JOIN...maybe even using NATURAL JOIN (just found that one).

    Anyway, thanks again for lending your advice. Have a great day!

    DrS

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •