Results 1 to 12 of 12
  1. #1
    Join Date
    May 2006
    Posts
    35

    Unanswered: Help with Optimizing SQL Query (Moving a WHERE Statement)

    The following SQL Query runs much too slow for my purposes.

    Code:
    SELECT tblDt.Dt, tblGx.GID, tblHydroProfiles.Capacity, tblGx.Cost
    FROM tblGx INNER JOIN (tblDt INNER JOIN tblHydroProfiles ON (tblDt.Hour=tblHydroProfiles.Hour) AND (tblDt.DOW=tblHydroProfiles.DOW) AND (tblDt.Month=tblHydroProfiles.Month)) ON tblGx.ProfileID=tblHydroProfiles.HydroProfileID
    WHERE (((tblGx.InServiceDate)<=tblDt!Dt) And ((tblGx.OutServiceDate)>tblDt!Dt)) And (tblDt.Dt)>=#1/1/2007# And (tblDt.Dt)<=#1/1/2008# And (tblGx.ResID)=1 Or (tblGx.ResID)=8
    ORDER BY Dt, GID;
    Since I am just learning SQL, I am sure that there are many ways to optimize the code.

    To start, I would like to move the second las line:

    Code:
    (tblGx.ResID)=1 Or (tblGx.ResID)=8
    To the second line so that it is executed before the join (since removing it entirely speeds up operation):
    Code:
    FROM tblGx INNER JOIN (tblDt INNER JOIN tblHydroProfiles ON
    Is this possible?

  2. #2
    Join Date
    Sep 2003
    Location
    MI
    Posts
    3,713
    Nope. Those are conditions ... They have to stay there. You can optimise them or reorder them.

    On that subject: This part of your condition can be optimized ...

    (tblDt.Dt)>=#1/1/2007# And (tblDt.Dt)<=#1/1/2008#

    TO: (tblDt.Dt BETWEEN #1/1/2007# AND #1/1/2008#)

    As for your speed up: Do you really need those last 2 conditions?
    Back to Access ... ADO is not the way to go for speed ...

  3. #3
    Join Date
    May 2006
    Posts
    35
    Well, the complete SQL statement is this:

    Code:
    SELECT DEV.Dt AS Dt, tblGx.GID AS GID, IIf(tblOutages!GID Is Null,Installed_Capacity,0) AS Actual_Capacity, tblGx.Cost AS Cost
    FROM [
    
    SELECT tblDt.Dt, tblGx.GID, tblGx.Installed_Capacity,  tblGx.Cost FROM tblDt, tblGx 
    WHERE Dt >=#1/1/2007# And Dt <#1/1/2008# AND tblGx.ResID <> 1 AND tblGx.ResID <> 2 AND tblGx.ResID <> 8
    UNION SELECT tblDt.Dt, tblGx.GID, tblHydroProfiles.Capacity, tblGx.Cost
    FROM tblGx INNER JOIN (tblDt INNER JOIN tblHydroProfiles ON (tblDt.Hour=tblHydroProfiles.Hour) AND (tblDt.DOW=tblHydroProfiles.DOW) AND (tblDt.Month=tblHydroProfiles.Month)) ON tblGx.ProfileID=tblHydroProfiles.HydroProfileID
    WHERE (((tblGx.InServiceDate)<=tblDt!Dt) And ((tblGx.OutServiceDate)>tblDt!Dt)) AND (tblDt.Dt)>=#1/1/2007# And (tblDt.Dt)<=#1/1/2008# AND (tblGx.ResID)=1 OR  (tblGx.ResID)=8 
    UNION SELECT
    tblDt.Dt, tblGx.GID, tblGx.Installed_Capacity*tblWindProfile.CapFac, tblGx.Cost 
    FROM tblDt INNER JOIN (tblGx INNER JOIN tblWindProfile ON tblGx.ProfileID=tblWindProfile.WindProfileID) ON (tblDt.Month=tblWindProfile.Month) AND (tblDt.Hour=tblWindProfile.Hour)
    WHERE (((tblGx.InServiceDate)<=tblDt!Dt) And ((tblGx.OutServiceDate)>tblDt!Dt)) AND
    (tblDt.Dt)>=#1/1/2007# And (tblDt.Dt)<=#1/1/2008# AND (tblGx.ResID)=2
    ]. AS DEV LEFT JOIN tblOutages ON (DEV.GID = tblOutages.GID) AND (DEV.Dt >= tblOutages.StartDate) AND (DEV.Dt <= tblOutages.EndDate)
    ORDER BY Dt, Cost;

    As you can see, I am treating different ResIDs differently, and as a result, I believe that I need to use them (although they are slowing the query down significantly). I will see if the BETWEEN statement helps though.

    EDIT: I should however be able to remove them if I change ProfileID to be distinct regardless of the ResID (right now, the same ProfileID can be for Wind or Hydro and be used for two different things, but this is not necessary)
    Last edited by Nai; 05-05-06 at 19:05.

  4. #4
    Join Date
    Sep 2003
    Location
    MI
    Posts
    3,713
    Nothing like changing the meaning of the conversation AFTER the fact ... Having a UNION query changes everything ...
    Back to Access ... ADO is not the way to go for speed ...

  5. #5
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Um... not sure about the your second post but your query looks wrong to me (correct me if I'm wrong - damn Accesses insistence on obsuctory brackets).

    Maybe:

    Code:
    SELECT tblDt.Dt, tblGx.GID, tblHydroProfiles.Capacity, tblGx.Cost
    FROM tblGx INNER JOIN (tblDt INNER JOIN tblHydroProfiles ON (tblDt.Hour=tblHydroProfiles.Hour) AND (tblDt.DOW=tblHydroProfiles.DOW) AND (tblDt.Month=tblHydroProfiles.Month)) ON tblGx.ProfileID=tblHydroProfiles.HydroProfileID
    WHERE tblGx.InServiceDate <=tblDt!Dt And tblGx.OutServiceDate >tblDt!Dt And tblDt.Dt >=#1/1/2007# And tblDt.Dt <=#1/1/2008# And (tblGx.ResID=1 Or tblGx.ResID=8)
    ORDER BY Dt, GID
    You were stating that you wanted all the date conditions to match as well as ResID to equal 1 or ResID to equal 8 (i.e. regardless of dates). I've removed all unnecessary barckets and added one important set.

    I don't know if BETWEEN is actually an optimisation - I know for a fact that SQL Server does not differentiate the two. Dunno about JET - totally different engine. It is neater code though so I'd go with Mike on that front anyway.

    You can get the conditions into the FROM statement but again - not sure of the advantage - suck and see?

    Code:
    SELECT tblDt.Dt, Gx.GID, tblHydroProfiles.Capacity, Gx.Cost
    FROM (SELECT GID, Cost, ProfileID FROM tblGx WHERE tblGx.ResID=1 Or tblGx.ResID=8) AS Gx INNER JOIN (tblDt INNER JOIN tblHydroProfiles ON (tblDt.Hour=tblHydroProfiles.Hour) AND (tblDt.DOW=tblHydroProfiles.DOW) AND (tblDt.Month=tblHydroProfiles.Month)) ON Gx.ProfileID=tblHydroProfiles.HydroProfileID
    WHERE Gx.InServiceDate <=tblDt!Dt And Gx.OutServiceDate >tblDt!Dt And tblDt.Dt >=#1/1/2007# And tblDt.Dt <=#1/1/2008# ORDER BY Dt, GID
    Air code though I'm afraid
    Testimonial:
    pootle flump
    ur codings are working excelent.

  6. #6
    Join Date
    May 2006
    Posts
    35
    Quote Originally Posted by pootle flump
    Um... not sure about the your second post but your query looks wrong to me (correct me if I'm wrong - damn Accesses insistence on obsuctory brackets).
    The code I posted worked... albeit very slowly.


    I was originally trying something similar to your posted code, but ultimately decided to stick with unique ProfileIDs relative to the ResIDs.

    My final result is this:

    Code:
    SELECT DEV.Dt AS Dt, tblGx.GID AS GID, IIf(tblOutages!GID Is Null,Installed_Capacity,0) AS Actual_Capacity, tblGx.Cost AS Cost
    FROM [
    
    SELECT tblDt.Dt, tblGx.GID, tblGx.Installed_Capacity,  tblGx.Cost FROM tblDt, tblGx 
    WHERE Dt >=#1/1/2007# And Dt <#1/1/2008# AND tblGx.ResID <> 1 AND tblGx.ResID <> 8 AND  tblGx.ResID <> 2
    UNION SELECT tblDt.Dt, tblGx.GID, tblHydroProfiles.Capacity, tblGx.Cost
    FROM tblGx INNER JOIN (tblDt INNER JOIN tblHydroProfiles ON (tblDt.Hour=tblHydroProfiles.Hour) AND (tblDt.DOW=tblHydroProfiles.DOW) AND (tblDt.Month=tblHydroProfiles.Month)) ON tblGx.ProfileID=tblHydroProfiles.HydroProfileID
    WHERE (((tblGx.InServiceDate)<=tblDt!Dt) And ((tblGx.OutServiceDate)>tblDt!Dt)) AND (tblDt.Dt)>=#1/1/2007# And (tblDt.Dt)<=#1/1/2008#
    UNION SELECT
    tblDt.Dt, tblGx.GID, tblGx.Installed_Capacity*tblWindProfile.CapFac, tblGx.Cost 
    FROM tblDt INNER JOIN (tblGx INNER JOIN tblWindProfile ON tblGx.ProfileID=tblWindProfile.WindProfileID) ON (tblDt.Month=tblWindProfile.Month) AND (tblDt.Hour=tblWindProfile.Hour)
    WHERE (((tblGx.InServiceDate)<=tblDt!Dt) And ((tblGx.OutServiceDate)>tblDt!Dt)) AND
    (tblDt.Dt)>=#1/1/2007# And (tblDt.Dt)<=#1/1/2008#
    ]. AS DEV LEFT JOIN tblOutages ON (DEV.GID = tblOutages.GID) AND (DEV.Dt >= tblOutages.StartDate) AND (DEV.Dt <= tblOutages.EndDate)
    ORDER BY Dt, Cost;
    I will add the BETWEEN statements M Owen suggested next, but it is still running far too slow ( 40 secs +, 1.3 million records result)

    Pootle, since you know what I am trying to accomplish (I hope), is this basically the best speed I can achieve using "set theory" (I still have to run the result through my program to "dispatch" every hour until the hourly loading requirement is met... and then perform more calcs...)

  7. #7
    Join Date
    Sep 2003
    Location
    MI
    Posts
    3,713
    Quote Originally Posted by pootle flump
    I don't know if BETWEEN is actually an optimisation - I know for a fact that SQL Server does not differentiate the two. Dunno about JET - totally different engine. It is neater code though so I'd go with Mike on that front anyway.
    BETWEEN is optimized in Jet ... There may not be any statistical difference between it and 2 discrete end tests for performance. I use it so that I don't have to evaluate additional conditions (plus it eliminates the "forget" factor - Oops I forgot to test for the end condition) ...
    Back to Access ... ADO is not the way to go for speed ...

  8. #8
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Thanks Mike

    Nai - this is giving the correct result? So we are just optimising? Fancy posting your db again - I don't have all those tables
    Testimonial:
    pootle flump
    ur codings are working excelent.

  9. #9
    Join Date
    May 2006
    Posts
    35
    Pootle,

    The current sample DB that I am working with (with less dates and less tables), is still ~2mb zipped. Due to its size, I cannot upload it onto the server despite the 10mb cap. Any suggestions (or should I cut up the DB further by removing some profiles?)

  10. #10
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    You could just post any tables (containing data) that were not in the original file. I already have populated tblDt, tblExisting and tblOutages so just remove the data from these. Compact and repair too.

    HTH
    Testimonial:
    pootle flump
    ur codings are working excelent.

  11. #11
    Join Date
    May 2006
    Posts
    35
    Pootle,

    Attached are two MDBs, one contains all of the previous tables (truncated), some of which have been renamed, and empty tables that have been left in due to their relationships..

    The empty tables have been uploaded separately for population into the first DB.
    Attached Files Attached Files

  12. #12
    Join Date
    May 2006
    Posts
    35
    As an update, the query time alone is taking is taking approx 70 seconds using the following codeblock in VBA:

    Code:
        Dim lIndexGx As Long
        Dim arrGx
        sSQL = "SELECT DEV.Dt AS Dt, tblGx.GID AS GID, IIf(tblOutages!GID Is Null,Installed_Capacity,0) AS Actual_Capacity, tblGx.ZoneID AS ZoneID " & _
                "FROM (SELECT tblDt.Dt, tblGx.GID, tblGx!Installed_Capacity*(1-tblGx![EFOR(%)]/100) AS Installed_Capacity,  tblGx.Cost, tblGx.ZoneID FROM tblDt, tblGx " & _
                "WHERE Dt BETWEEN #" & sStartDate & "# And #" & sEndDate & "# AND tblGx.ResID <> 1 AND tblGx.ResID <> 8 AND  tblGx.ResID <> 2 " & _
                "UNION SELECT tblDt.Dt, tblGx.GID, tblHydroProfiles.Capacity, tblGx.Cost, tblGx.ZoneID " & _
                "FROM tblGx INNER JOIN (tblDt INNER JOIN tblHydroProfiles ON (tblDt.Hour=tblHydroProfiles.Hour) AND (tblDt.DOW=tblHydroProfiles.DOW) AND (tblDt.Month=tblHydroProfiles.Month)) ON tblGx.ProfileID=tblHydroProfiles.HydroProfileID " & _
                "WHERE (((tblGx.InServiceDate)<=tblDt!Dt) And ((tblGx.OutServiceDate)>tblDt!Dt)) AND tblDt.Dt BETWEEN #" & sStartDate & "# And #" & sEndDate & "# " & _
                "UNION SELECT " & _
                "tblDt.Dt , tblGx.GID, tblGx.Installed_Capacity * tblWindProfile.CapFac, tblGx.Cost, tblGx.ZoneID " & _
                "FROM tblDt INNER JOIN (tblGx INNER JOIN tblWindProfile ON tblGx.ProfileID=tblWindProfile.WindProfileID) ON (tblDt.Month=tblWindProfile.Month) AND (tblDt.Hour=tblWindProfile.Hour) " & _
                "WHERE (((tblGx.InServiceDate)<=tblDt!Dt) And ((tblGx.OutServiceDate)>tblDt!Dt)) AND " & _
                "tblDt.Dt BETWEEN #" & sStartDate & "# And #" & sEndDate & "# " & _
                ") AS DEV LEFT JOIN tblOutages ON (DEV.Dt <= tblOutages.EndDate) AND (DEV.Dt >= tblOutages.StartDate) AND (DEV.GID = tblOutages.GID) " & _
                "ORDER BY Dt, Cost;"
    
        adoRs.Open Source:=sSQL, _
            ActiveConnection:=adoConn
        arrGx = adoRs.GetRows
        lIndexGx = 0
        adoRs.Close
        Set adoRs = Nothing
    The total program running time is up from ~70 secs to ~90 secs (although before I was using a bunch of excel spreadsheets that were hard to manage, and now the data seems much more manageable (using update/append/delete queries)) The memory usage has also gone up from ~60MBs to ~100MBs (due to the code above).

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •