Results 1 to 4 of 4
  1. #1
    Join Date
    Dec 2011
    Location
    Budapest
    Posts
    2

    Unanswered: Help with optimizing query

    Hi,

    I have a relatively complex query for the following situation. There are trucks with coupling(s) join them (1 or more). There are routes(transports) with one or more fuel bills on a route. The db schema simplified looks the following:

    Routes:
    -> id
    -> start_date
    -> arrival_date
    -> start_km
    -> end_km

    Couplings:
    -> id
    -> plate_number

    Routes_Couplings: (Routes,Couplings join table, n:m relation):
    -> route_id
    -> coupling_id

    Fuels:
    -> id
    -> buy_date
    -> amount

    Routes_Fuels: (Routes,Fuels join table, n:m relations):
    -> route_id
    -> fuel_id

    I had to make a select with the following columns whit in:
    -> Couplings.plate_number (grouped by)
    -> All taken fuels SUM(Fuels.amount)
    -> The routes distance (Routes.end_km - Routes.start_km)

    I came to the following result:
    PHP Code:
    SELECT 
         cp
    .plate_number,
                ( 
                   
    SELECT 
                        SUM
    (f.amount)
                   
    FROM
                        Routes_Fuels rf ON rf
    .route_id r.id
                        INNER JOIN Fuels f ON rf
    .fuel_id f.id
                   GROUP BY
                        f
    .amount
              
    ) as fuelsum
         
    ,(r.end_km-r.start_km) as distance
    FROM
         Couplings cp
              INNER JOIN Routes_Couplings rc ON rc
    .couple_id cp.id
              INNER JOIN Routes r ON rc
    .route_id r.id
    WHERE
         r
    .start_date like "2011-01%"
    GROUP BY
         cp
    .plate_number 
    Is there a more optimized way? I know this is not ANSI SQL compatible, would prefer that as well.

    thanks
    Sandor

  2. #2
    Join Date
    Sep 2009
    Location
    San Sebastian, Spain
    Posts
    880
    Firstly, are you storing the start_date as text? Is there an index on this field? If there is I am not sure whether this will be picked up with this WHERE clause. If you do an explain on this select statement we can verify this. If it does not pick it up then use r.start_date >= '2011-01-01' and r.start_date < '2011-02' which should use the index. Also adding in the GROUP BY cp.plate_number has no impact and can be removed.
    Ronan Cashell
    Certified Oracle DBA/Certified MySQL Expert (DBA & Cluster DBA)
    http://www.it-iss.com
    Follow me on Twitter

  3. #3
    Join Date
    Dec 2011
    Location
    Budapest
    Posts
    2

    theoretical optimizing of a query for ANSI SQL

    thanks for the hints,

    my question regarding "more optimized way" was if it can be overwritten to support ANSI SQL for a potential db migration? I was using mysql until now and want to extend my knowledge regarding ANSI SQL.

    I didn't wrote an exact schema yesterday just because my statement above.
    Of course as I wrote it here (mysql section) because querying datetime columns this way in mysql is ok (based on my experience) the ANSI way would be using year and month functions (I think both very common across db platforms, even not ,I suppose, ANSI supported functions).

    Optimizing queries through good indexes is a must, which I know so my questions are more theoretical.

    Right now I got a new theoretical version of the schema which is the following:
    (the n:m relations is between Routes and Fuels has been changed to 1:n relations, 1 Route contain 0 or more fuels taken)

    Routes:
    -> id
    -> start_date
    -> arrival_date
    -> start_km
    -> end_km

    Couplings:
    -> id
    -> plate_number

    Routes_Couplings: (Routes,Couplings join table, n:m relation):
    -> route_id
    -> coupling_id

    Fuels: (1:n relation to Routes)
    -> id
    -> route_id
    -> buy_date
    -> amount
    And my current theoretical query look like this:
    PHP Code:
    SELECT  
         cp
    .plate_number
                (  
                   
    SELECT  
                        SUM
    (f.amount
                   
    FROM 
                        Fuels f
                   WHERE
                        f
    .route_id r.id 
                   GROUP BY 
                        f
    .route_id 
              
    ) as fuelsum 
         
    ,(r.end_km-r.start_km) as distance 
    FROM 
         Couplings cp 
              INNER JOIN Routes_Couplings rc ON rc
    .couple_id cp.id 
              INNER JOIN Routes r ON rc
    .route_id r.id 
    WHERE 
         YEAR
    (r.start_date) = 2011 AND MONTH(f.start_date) = 1
    GROUP BY 
         cp
    .plate_number 
    I know this is still wrong as of a coupling can be part of more than one routes and a route can contain more couplings as well. But I keep working on it.

    thanks
    Sandor

    sorry for my english if I was misleading in any way

  4. #4
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by carstep View Post
    Code:
    WHERE
         r.start_date like "2011-01%"
    to make this ANSI compatible, and give the optimizer a chance to use an index on the start_date column (which it cannot do if you apply YEAR() and MONTH() functions), write it this way --
    Code:
    WHERE r.start_date >= '2011-01-01'
      AND r.start_date  < '2011-02-01'
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •