Results 1 to 4 of 4
  1. #1
    Join Date
    Feb 2007
    Posts
    348

    Designing queries: baby steps or one giant leap

    I have thought of this question several different ways so I hope I can express it in a way others can interpret. If I am gathering data and I need to tie together several tables and I need to do some calculations, is it better to have one big query with all that or to write queries that build on each other, calling each other. Now I realize this treads on the subquery V join thread below and I have read that but either a) I too dense to understand the outcome or b) no real resolution was reached. So I thought I would give it another shot.

    So, an example: (I realized as I wrote this that it looks a lot like a homework problem but I swear it is not)
    CutomerTable:
    CustomerNum, CompanyName, Contact, AssignedSalesPerson

    Employee:
    EmpNum, LName, FName, HomePhone, WorkPhone, HireDate, EmergNum, Department

    Product:
    ProdNum, Title, WholeSalePrice, RetailPrice

    Sales:
    Customer,SalesPerson,Product,Date,Price,Paid

    So you want to show Which Department made the most profit per quarter for a period of two years. So you have to tie all four of the tables together to get the data you need above, specifically:
    Employee.Department, Sales.Date, Sales.Price, Sales.Paid
    In order to do that you have to link through three of the four above tables (I guess I didn't think my example through as well as I thought) and you need to do two more fields, one for quarter and one for profit. We could make it more fun and like my real life situation and say that the product table is older than all the other so you also have to do a conversion there too, for a third calculated field.

    So the real quesion is.....In terms of trying to get this to run fairly quickly (because you are expecting millions of hits and would like to be able to run this in an 8 hour day) is it better to build a bunch of queries that call each other, for example:
    Query 1 - links department to sales
    Query 2 - links sales to product
    Query 3 - calculates the profit per item
    Query 4 - comes up with a field to help sales and profit communicate because the sales number has changed in one but not the other
    Query 4 - links department in sales to all sales for that department
    Query 5 - links profit to all the sales and departments
    Query 6 - calculates quarter and has departments and profits in it
    then you run a report to group everything and off it goes

    or is it better to run one or two massive queries that do all the linking, grouping and calculatings?

  2. #2
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    In this particular case, the second option is MUCH preferable.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  3. #3
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    is this microsoft access? because if it is, i have a few years of working with large sets of data in microsoft access, and i can tell you from hard experience that the multiple-query approach is by far the better way to approach it, rather than the single query

    especially if, as is so often the case, you need to produce some totals from your intermediate results and get management's signoff ("yes, these are the correct profit numbers for those brands")

    the jet optimizer sucks a big lemon
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  4. #4
    Join Date
    Feb 2007
    Posts
    348
    The question is somewhat hypothetical but for the real work component I have tried using Crystal and Access. The actual database is a much larger database that is elsewhere on the campus and I really don't know exactly how it is stored. It seems I can point essentially any database reporting tool at it, which is nice.
    I have been working on it since and have had more luck with the smaller queries approach which seems backwards to me but perhaps I need to review the order of execution in queries so I can understand it.
    The biggest frustration so far is that I have not been able to run all the data at any time yet. It's simple too much and when I try to let it run overnight it seems to reset at a certain point in the night.
    Sadly, there is no manager signoff which is disconcerting. I am running the numbers for a much higher up who I don't know and who doesn't know me. I can give them about any number and as long as it's in the ballpark, they'll assume it correct.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •