Results 1 to 11 of 11
  1. #1
    Join Date
    Oct 2006
    Posts
    9

    Question Unanswered: Max() on JOIN subquery with extra columns

    Locations : Events = 1 : many
    Locations 381 rows
    Events 397 rows

    I am trying to join the Locations and Events tables in order to get the most recent maintenance event for each location. The only differences below exist in the outer portion of the query with the Max() function (the "sub" is identical in each). The first query returns all the correct records that I want (the most recent event or a null event for each location in division 11, 381 rows). However, it does not give me the event_id that is associated with the event cmpltd_dt so I have no way of tying the location to the event. The second query adds the maint_evnt_id to the query but doing so breaks the aggregate function results we had correct in the first query.

    Code:
    SELECT  sub.loc_id, Max(sub.cmpltd_dt) as MAXcmpltd_dt
    FROM 
    (
    SELECT locations.loc_id, events.event_id, events.cmpltd_dt 
    FROM locations 
    LEFT JOIN (SELECT * FROM events WHERE event_cd=701 AND div_nbr=11) events 
    ON locations.loc_id = events.loc_id 
    WHERE locations.div_nbr=11 
    ORDER BY events.cmpltd_dt ASC, locations.loc_id ASC
    ) 
    sub 
    GROUP BY sub.loc_id
    Code:
    SELECT  sub.loc_id, Max(sub.cmpltd_dt) as MAXcmpltd_dt, sub.event_id
    FROM 
    (
    SELECT locations.loc_id, events.event_id, events.cmpltd_dt 
    FROM locations 
    LEFT JOIN (SELECT * FROM events WHERE event_cd=701 AND div_nbr=11) events 
    ON locations.loc_id = events.loc_id 
    WHERE locations.div_nbr=11 
    ORDER BY events.cmpltd_dt ASC, locations.loc_id ASC
    ) 
    sub 
    GROUP BY sub.loc_id , sub.event_id
    Anyone able to help will join r937 on the Whiteboard of Fame in conference room C-011.

  2. #2
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Code:
    SELECT l.loc_id, e.event_id, e.cmpltd_dt
       FROM locations AS l
       JOIN events AS e
          ON (e.loc_id = l.loc_id)
       WHERE  11 = l.div_nbr
          AND e.cmpltd_dt = (SELECT Max(z.cmpltd_dt)
             FROM events AS z
             WHERE  z.loc_id = l.loc_id)
       ORDER BY e.cmpltd_dt ASC, l.loc_id ASC
    -PatP

  3. #3
    Join Date
    Oct 2006
    Posts
    9
    Unfortunately, no dice.
    The date field is not unique enough to pull from the event table (some events have the same date).
    Also, this doesn't return a row for locations with no event.

    So far, the closest I can get is:
    Code:
    SELECT  sub.loc_id, Max(sub.cmpltd_dt) as MAXcmpltd_dt, Max(sub.event_id) as event_id
    FROM 
    (
    SELECT locations.loc_id, events.event_id, events.cmpltd_dt 
    FROM locations 
    LEFT JOIN (SELECT * FROM events WHERE event_cd=701 AND div_nbr=11) events 
    ON locations.loc_id = events.loc_id 
    WHERE locations.div_nbr=11 
    ORDER BY events.cmpltd_dt ASC, locations.loc_id ASC
    ) 
    sub 
    GROUP BY sub.loc_id
    This works 99% of the time because the event_id is a sort of a timestamp-derived key and 99% of the time the event is created at a time that matches the cmpltd_dt field so it sorts the same, but there are situations that break it so it is no good.

  4. #4
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Code:
    SELECT l.loc_id, e.event_id, e.cmpltd_dt
       FROM locations AS l
       LEFT JOIN (SELECT *
          FROM events AS z1
          WHERE  701 = z1.event_code
             AND z1.cmpltd_dt = (SELECT Max(z.cmpltd_dt)
                FROM events AS z2
                WHERE  z2.loc_id = z1.loc_id
                   AND 701 = z2.event_code)) AS e
          ON (e.loc_id = l.loc_id)
       WHERE  11 = l.div_nbr
       ORDER BY e.cmpltd_dt ASC, l.loc_id ASC
    I take it that these aren't all maintenance events then, and that 701 indicates a maintenance event.

    -PatP

  5. #5
    Join Date
    Oct 2006
    Posts
    9
    Yeah, sorry about that. Must have lopped it off when trying to make the SQL readable. Of the 381 Locations, 164 don't have a maintenance event so should return null in the join. The rest of the locations of 1-3 maintenance events that I am trying to get the latest of with Max().

  6. #6
    Join Date
    Oct 2006
    Posts
    9

    Simplified

    Maybe this will be easier... Here is a representation of the data that I have. All Loc's are represented and some will have events associated with them but some will not. I want to query this data and return 1 row for each Loc (regardless of events), and if it has an event, give me the latest one (if it doesn't give me a null). Like this...
    Code:
    loc_id	event_dt	event_id
    100	7/9/2008	1
    100	7/15/2008	2
    101	null		null
    102	7/9/2008	3
    103	7/9/2008	4
    103	7/22/2008	5
    104	7/9/2008	6
    104	7/13/2008	7
    104	7/28/2008	8
    105	null		null
    106	7/9/2008	9
    ** magical query stuff happens **

    Code:
    loc_id	event_dt	event_id
    100	7/15/2008	2
    101	null		null
    102	7/9/2008	3
    103	7/22/2008	5
    104	7/28/2008	8
    105	null		null
    106	7/9/2008	9
    We have one row for each Loc and either a null record or the latest event. Max() doesn't do it because it factors the event_id. Any ideas?

  7. #7
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    So, did my SQL from post #4 solve the problem, or does it still need more work? I think that it is complete and correct, but you've posted twice since then so I'm not sure.

    -PatP

  8. #8
    Join Date
    Oct 2006
    Posts
    9
    Wow, I am an idiot. I thought you were quoting my snippet from earlier.

    Just ran your query and it is almost dead on. I get 386 rows instead of the 381 because there are five "701" events done the same day as another at the same location. Even if I can't get these double-ups removed your SQL definitely seems to give me correct results. Do you see where I can through in a DISTINCT or something to be sure I get just the 381?

  9. #9
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Yes, DISTINCT will do that.

    Just as an observation, you REALLY need to think about normalizing that bundle of fun. There are a few "issues" in what you've posted so far, and I think what I can see from here is only the tip of the iceberg... Guaranteed, no doubt in my mind, it is only a matter of time before it comes to bite you.

    -PatP

  10. #10
    Join Date
    Oct 2006
    Posts
    9
    Is it not normalized?

    Location table
    loc_id(pk) a_bunch_of_location_properties…

    Events table
    loc_id(fk) event_id(pk) event_dt event_cd(fk to code table) other_event_properties…

    The only think that I thought breaks the normalization was that I put a copy of the location division number in the event table because for processing just events it became a lot of trouble always having to join locations. In fact, originally to get an event division number you had to join it to the locations table, then locations to the city table, and city to the county table that determines the division... seemed like a lot of overhead to simply give me all the locations in a specific division (which is what all my queries require since each user is division specific).

    I really appreciate your help. The DBA's here refuse to look at SQL and our developers are engineering specific with hardly any DB experience so we get lost quite easily.

  11. #11
    Join Date
    Jan 2007
    Location
    Jena, Germany
    Posts
    2,721
    First you should try to get a clean design in 3NF, i.e. avoid duplicated information as much as possible. Once you are there, you can tune your system to give you the desired performance. Denormalizing things is a valid approach during performance tuning, but it should be the last resort because it comes with a lot of problems and impact on your application design.
    Knut Stolze
    IBM DB2 Analytics Accelerator
    IBM Germany Research & Development

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •