Results 1 to 12 of 12
  1. #1
    Join Date
    Nov 2003
    Location
    Sinapore
    Posts
    187

    Unanswered: Select and count with 2 tables

    Hi,

    Greetings

    I've two tables whose outputs are as follows:

    Code:
    table1 (message_id is primary key)
    
    message_id message
    1                hello
    2                how's going?
    3                wassup?
    
    table2 (message_id is foreign key)
    
    message_id message
    1                i'm fine
    1                good day
    1                no text
    2                great!
    What I'm trying to do is select everything from table1 and have a way to know many corresponding message_ids there are in table2. In other words, I'm trying to achieve something like:
    Code:
    message_id message           replies
    1                hello                 4
    2                how's going?      1
    3                wassup              0
    I tried with a LEFT JOIN but don't seem to be getting the results I was looking for.

    Hope someone can enlighten me

    Thanks in anticipation.
    Last edited by pearl2; 03-04-04 at 23:10.

  2. #2
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    you are right, a LEFT JOIN is needed

    using your example, i will illustrate the difference between COUNT(*) and COUNT(column)
    PHP Code:
    select table1.message_id  as id
         
    table1.message     as message1
         
    table2.message     as message2
      from table1
    left outer
      join table2
        on table1
    .message_id table2.message_id

    id  message1      message2      
     1  hello         i
    'm fine
     1  hello         good day
     1  hello         no text
     2  how'
    s going?  great!
     
    3  wassup?       NULL 
    note that message 3 has no matching row in table2, but there is a row for message 3 in the result set!

    now let's take totals
    PHP Code:
    select table1.message_id  as id
         
    table1.message     as message1
         
    count(*)           as resultrows
         
    count(table2.message)   as matchingrows
      from table1
    left outer
      join table2
        on table1
    .message_id table2.message_id
    group
        by table1
    .message_id
         
    table1.message  
        
    id  message1      resultrows  matchingrows      
     1  hello                3        3
     2  how
    's going?         1        1
     3  wassup?              1        0 
    the reason you get the right answer with COUNT(column) when using a LEFT OUTER join is because COUNT(column), like all aggregate functions except COUNT(*), ignores nulls
    Last edited by r937; 03-04-04 at 10:14.
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  3. #3
    Join Date
    Nov 2003
    Location
    Sinapore
    Posts
    187
    Thank you so much, r937!

    I'll try it out right away!

  4. #4
    Join Date
    Nov 2003
    Location
    Sinapore
    Posts
    187
    Awesome!

    A zillion thanks, r937!

    Btw, what difference does 'OUTER' add to the query?

  5. #5
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    OUTER is an optional keyword
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  6. #6
    Join Date
    Nov 2003
    Location
    Sinapore
    Posts
    187
    Cool, thanks!


  7. #7
    Join Date
    Nov 2003
    Location
    Sinapore
    Posts
    187
    I hope you can help me with this related question.

    Given the same table outputs, with a LEFT JOIN as follows:
    Code:
    #sample data from table1 and table2
    
    table1 (message_id is primary key)
    
    message_id message
    1                hello
    2                how's going?
    3                wassup?
    
    table2 (message_id is foreign key)
    
    message_id message
    1                i'm fine
    1                good day
    1                no text
    2                great!
    
    # select query
    SELECT *
    FROM table1
    LEFT JOIN table2 USING (mesage_id)
    WHERE message_id=1
    I observe that for every matched message_id, the same row in table 1 is selected. In the other words, the output of the query is something like:
    Code:
    1, hello, 1, i'm fine
    1, hello, 1, good day
    1, hello, 1, no text
    If the message is large, this could result in huge memory being used. I was wondering whether it's possible for the select to return something like:
    Code:
    1, hello
    1, i'm fine
    1, good day
    1, no text
    I suppose I could do it with two independent queries as follows:
    Code:
    SELECT *
    FROM table1
    WHERE message_id=1
    
    SELECT *
    FROM table2
    WHERE message_id=1
    I hope I'm not missing something terribly obvious. Thanks for the great help you've already rendered.

  8. #8
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    do not even think about trying to "streamline" the results with sql

    this is the job of the application program

    you are right, if the original message is large, you would do the retrieval with two separate queries
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  9. #9
    Join Date
    Nov 2003
    Location
    Sinapore
    Posts
    187
    Thanks! I'll do that


  10. #10
    Join Date
    Nov 2003
    Location
    Sinapore
    Posts
    187
    Hi,

    I've added date and time to my tables and I'm now faced with another problem.

    What I'm trying to achieve is the same: Select relevant columns from table1 and have a way to know how many corresponding message_ids there are in table2. The complexity I've added is I would like to select the latest added entry from table2.
    Code:
    # sample data from table1 and table2
    table1 (message_id is primary key)
    
    message_id message         date             time
    1                hello               2004-03-04   08:51:00
    2                how's going?    2004-03-05   21:22:00
    3                wassup?          2004-03-05   23:22:00
    
    table2 (message_id is foreign key)
    
    message_id message    date             time
    1                first reply   2004-03-04   10:51:00
    1                i'm fine      2004-03-05   07:51:00
    1                good day   2004-03-05   08:51:00
    1                no text      2004-03-05   09:51:00
    2                great!        2004-03-04   20:51:00
    
    # sql query
    select table1.message_id  as id
         , table1.message     as message1
         , count(table2.message)  as matchingrows
         , table2.date as date2
         , table2.time as time2
      from table1
    left outer
      join table2
        on table1.message_id = table2.message_id
    group
        by table1.message_id
         , table1.message  
    
    # desired output
    id  message1      matchingrows  date2          time2
     1  hello              3                   2004-03-05  09:51:00<--
     2  how's going?   1                   2004-03-04   20:51:00
     3  wassup?         0
    message1 has 4 entries in table2. How do I select the latest entry (2004-03-05 09:51:00)? Currently, the entry "1 i'm fine 2004-03-05 07:51:00" is selected i.e. the oldest entry of 2004-03-05.

    I've tried adding ORDER by after GROUP BY but I'm not getting the results.

    I'm absolutely clueless on how to go about it. Any help will be appreciated

  11. #11
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    how "married" are you to the need to get the latest reply at the same time as the number of replies, all in one query?

    because that one query will be large, ungainly, and relatively inefficient

    at this point it may be easier for you to do the left outer join without grouping, use an ORDER BY to get the replies into sequence, then do the counting and selecting of latest reply in your application
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  12. #12
    Join Date
    Nov 2003
    Location
    Sinapore
    Posts
    187
    Great thanks for your advice, r937!

    Yes, I came to that realization when nothing seemed to work with just one query.

    I thought I would try with creating another table called 'latest'. Whenever a post is made, "table1" and "latest" are both inserted with the relevant data.
    Code:
    table1 (main post)
    post_id author  topic   date             time
    2          john    hey     2004-03-05   21:22:00
    3          min     hi        2004-03-05   23:10:00
    
    latest
    post_id author  date              time
    2          john     2004-03-05   21:22:00
    3          min      2004-03-05   23:10:00
    When a reply is made, "table2" is inserted with the reply data and "latest" gets updated with information on the reply:
    Code:
    table2 (replies)
    post_id author  topic       date             time
    2          jie       goodday  2004-03-05   22:17:20
    
    latest
    post_id author  date              time
    2          jie       2004-03-05    22:17:20
    The sql query is as follows:
    Code:
    SELECT
       table1.post_id,
       table1.post,
       table1.author,
       COUNT(table2.post_id),
       latest.date,
       latest.time,
       latest.author
    FROM table1
       LEFT JOIN latest USING (post_id)
       LEFT JOIN table2 USING (post_id)
       GROUP BY table1.post_id, topic
       ORDER BY latest.date DESC, latest.time DESC;
    That produes the desired results. Strangely, when the sequence of the two LEFT JOINS is swapped, the query doesn't work.

    There's a bit duplication here as another table (latest) is created with essentially the same data as either "table1" (main post) or "table2" (replies). But it was what I could think of at that point in time.

    Do you think it's bad idea to create another table just for that purpose?
    Last edited by pearl2; 03-05-04 at 11:54.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •