Results 1 to 4 of 4
  1. #1
    Join Date
    Jan 2009
    Posts
    2

    Unanswered: How do I optimize the folllowing query?

    I have implemented the following query to search the db for exact phrase (ex : if the search term is "air" should match the exact word not like hair).

    select A.*, B.*,C.GGCategoryID,C.CategoryName,C.CategorySlugNa me,XX.EcoStatus from tbl_products A,tbl_members B,tbl_product_categories C ,tbl_eco_ratings XX WHERE A.MerchantMemberID=B.MemberID and B.MemberType='Merchant' and A.GGCategoryID=C.GGCategoryID and XX.EcoStatusValue=A.EcoStatusValue and A.Status='Active' and C.Status='Active' and C.isDeleted='No' and ( (A.ProductName REGEXP '[[:<:]]#SEARCHTERM#[[:>:]]' OR A.ProductName REGEXP '[[:<:]]#SEARCHTERM#s[[:>:]]' ) OR (A.ProductDescription REGEXP '[[:<:]]#SEARCHTERM#[[:>:]]' OR A.ProductDescription REGEXP '[[:<:]]#SEARCHTERM#s[[:>:]]' ) OR (B.MemberNickName REGEXP '[[:<:]]#SEARCHTERM#[[:>:]]' OR B.MemberNickName REGEXP '[[:<:]]#SEARCHTERM#s[[:>:]]' ) OR (B.ScreenName REGEXP '[[:<:]]#SEARCHTERM#[[:>:]]' OR B.ScreenName REGEXP '[[:<:]]#SEARCHTERM#s[[:>:]]' ) OR (A.ProductType REGEXP '[[:<:]]#SEARCHTERM#[[:>:]]' OR A.ProductType REGEXP '[[:<:]]#SEARCHTERM#s[[:>:]]' ) OR (C.CategoryName REGEXP '[[:<:]]#SEARCHTERM#[[:>:]]' OR C.CategoryName REGEXP '[[:<:]]#SEARCHTERM#s[[:>:]]' ) ) and A.isDeleted = 'No' and B.MerchantType = 'Online' Group By A.`ProductName` Order By ProductID DESC LIMIT 0,10
    some info:

    i) table A has more than 100,000 records
    ii)I didn't use LIKE (%blahblah%) and MATCH AGAINST
    ii)Major columns are indexed in each table

    But I feel that , it has some disadvantages over here when using this above query (MAY BE it is REGEXP).

    I am not expert in MYSql,So you guys explain the disadvantages for the above query, what are the pitfalls , performance issues etc..

    Also It would be helpful if you guys tell me the alternate way of this above query..

    Regards!
    securegds

  2. #2
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Quote Originally Posted by securegds
    I am not expert in MYSql,So you guys explain the disadvantages for the above query, what are the pitfalls , performance issues etc..
    Your code is unreadable which makes it difficult to improve (try formatting it and using the # button). Some explanation of the code might also be helpful. But anyway here are some helpful pointers (in no particular order):
    • All the ORs in your code means bad performance is inevitable.
    • Indexes work poorly on this type of query - you'll need to look at full text indexes.
    • You are limiting the records returned to 10, presumably hoping that this will produce a quicker response, but then you are ordering the result so all data has to be returned (in order to order the data) before taking the first 10.
    • You are not using the full features of REGEXP ie you could use A.ProductName REGEXP '[[:<:]]#SEARCHTERM#s?[[:>:]]' to optionally search for an s at the end of your string rather than inserting yet another OR each time.
    • You are effectively table scanning all your tables and then joining all your tables together - no wonder it's slow.
    is that enough pitfalls to keep you busy?

    Mike

  3. #3
    Join Date
    Jan 2009
    Posts
    2

    formatted QUERY

    PHP Code:
    select A.*, B.*,C.GGCategoryID,C.CategoryName,C.CategorySlugName,XX.EcoStatus from 
     tbl_products A
    ,tbl_members B,tbl_product_categories C ,tbl_eco_ratings XX 
     WHERE 
     A
    .MerchantMemberID=B.MemberID and 
     
    B.MemberType='Merchant' and 
     
    A.GGCategoryID=C.GGCategoryID and 
     
    XX.EcoStatusValue=A.EcoStatusValue and 
     
    A.Status='Active' and C.Status='Active' and 
     
    C.isDeleted='No' 
     
    and ( 
     (
    A.ProductName REGEXP '[[:<:]]#SEARCHTERM#[[:>:]]' OR A.ProductName REGEXP '[[:<:]]#SEARCHTERM#s[[:>:]]' ) OR 
     (
    A.ProductDescription REGEXP '[[:<:]]#SEARCHTERM#[[:>:]]' OR A.ProductDescription REGEXP '[[:<:]]#SEARCHTERM#s[[:>:]]' ) OR 
     (
    B.MemberNickName REGEXP '[[:<:]]#SEARCHTERM#[[:>:]]' OR 
      
    B.MemberNickName REGEXP '[[:<:]]#SEARCHTERM#s[[:>:]]' ) OR 
     (
    B.ScreenName REGEXP '[[:<:]]#SEARCHTERM#[[:>:]]' OR B.ScreenName REGEXP '[[:<:]]#SEARCHTERM#s[[:>:]]' ) OR
     (
    A.ProductType REGEXP '[[:<:]]#SEARCHTERM#[[:>:]]' OR A.ProductType REGEXP '[[:<:]]#SEARCHTERM#s[[:>:]]' ) OR 
     (
    C.CategoryName REGEXP '[[:<:]]#SEARCHTERM#[[:>:]]' OR C.CategoryName REGEXP '[[:<:]]#SEARCHTERM#s[[:>:]]' ) ) 
     and 
    A.isDeleted 'No' and B.MerchantType 'Online' Group By A.`ProductNameOrder By ProductID DESC LIMIT 0,10
     
     
    #SEARCHTERM# tends to "search keyword" here 

  4. #4
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    I've altered the code to format it better and to just have one regexp comparison (using s?) rather than 2 separate matches. It will still run slow because of the other points I mentioned but ...
    Code:
    select    A.*, 
              B.*,
              C.GGCategoryID,C.CategoryName,C.CategorySlugName,
              XX.EcoStatus
    from      tbl_products A,
              tbl_members B,
              tbl_product_categories C ,
              tbl_eco_ratings XX 
     WHERE    A.MerchantMemberID=B.MemberID
              and B.MemberType='Merchant' 
              and A.GGCategoryID=C.GGCategoryID 
              and XX.EcoStatusValue=A.EcoStatusValue 
              and A.Status='Active' 
              and A.isDeleted = 'No' 
              and B.MerchantType = 'Online' 
              and C.Status='Active' 
              and C.isDeleted='No' 
              and ( A.ProductName REGEXP '[[:<:]]#SEARCHTERM#s?[[:>:]]'
                    OR A.ProductDescription REGEXP '[[:<:]]#SEARCHTERM#s?[[:>:]]'
                    OR B.MemberNickName REGEXP '[[:<:]]#SEARCHTERM#s?[[:>:]]' 
                    OR B.ScreenName REGEXP '[[:<:]]#SEARCHTERM#s?[[:>:]]' 
                    OR A.ProductType REGEXP '[[:<:]]#SEARCHTERM#s?[[:>:]]' 
                    OR C.CategoryName REGEXP '[[:<:]]#SEARCHTERM#s?[[:>:]]'  ) 
    Group By A.`ProductName` 
    Order By ProductID DESC 
    LIMIT 0,10
    As well as my original points I'd also point out:
    • Using tbl_ to start each table name is a poor naming convention.
    • If a search string matches a category name then the user will always get everything from that category returned even if his search matches a product exactly. This will confuse most users as they'd never understand why they were given these items.
    • I can see what the code is doing now so please disregard the idea of full text searches I mentioned before.
    • You may have problems with matching upper and lower case depending on how your server is set up (easy to test though).
    • Also if the user searches for 2 words with a space in ie "Seagate drives" then it won't match against "Seagate hard drives".


    I think you should split the searches into separate searches: 1st to match against product names, 2nd product desc etc. If at any stage you have 10 records returned then you can stop searching further and just return the 10 matches. This should make the results a better match to what the user wants and it should be a lot quicker.

    You could also group the results showing the matches on product but then also saying that there are 2 matches on memberNickName and perhaps 1 on screenName and so giving the user the option of choosing these matches ie if the user searched for QNAP
    Code:
    Products:
      QNAP TS-409
      QNAP TS-209
      QNAP TS-209 Pro
    Manufactures:
      QNAP Store

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •