Results 1 to 13 of 13
  1. #1
    Join Date
    Sep 2009
    Posts
    4

    Unanswered: Count to % question

    I'm working with survey data. Essentially, I want to count total responses for a question for a location, then use that total to create what percentage of all responses each particular response is. I then want to group results by location.

    An ideal output would be similar to this:

    Q1 | State | City | %
    yes| MA | bos |10
    no | MA | bos |40
    m. | MA | bos |50
    yes| MA | cam |20
    no | MA | cam |20
    m. | MA | cam |80

    The problem I run into (I believe) is that GROUP BY works before my count statement, so I can't count all the responses. Below is an example of what I have to produce real numbers:

    SELECT q1, state, city, COUNT(q1) FROM master GROUP BY state, city, q1


    not all questions have responses, so below is my attempt to get %:

    SELECT q1, state, city, count(q1)/(count(nullif(q1,0))) as percent FROM master group by state, city, q1


    I believe using WITH or OVER(PARTITION BY...) would be a possible avenue, but I can't get either to work. Any help or direction would be much appreciated.

  2. #2
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Quote Originally Posted by petschek
    Any help or direction would be much appreciated.
    Something like this should work:
    Code:
    SELECT q1, state, city, 
            ( count(*)*100 ) / ( select count(*) from master ) as percent 
    FROM master 
    group by state, city, q1
    PS master isn't a very good table name and q1 implies a question rather than an answer to a question.

    Mike

  3. #3
    Join Date
    Sep 2009
    Posts
    4
    Thanks Mike. However, that doesn't give me % by location. That gives me a % against the whole. My understanding is that GROUP BY is done first, so if I have that, perhaps somehow one could 'count (all groups)', or something to that effect?

  4. #4
    Join Date
    Sep 2009
    Posts
    4
    ...i also simplified the names of fields/tables, but point well taken.

    nick

  5. #5
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Quote Originally Posted by petschek
    Thanks Mike. However, that doesn't give me % by location. That gives me a % against the whole. My understanding is that GROUP BY is done first, so if I have that, perhaps somehow one could 'count (all groups)', or something to that effect?
    It should do what I thought you wanted (if that makes sense). I've now looked more closely at your percentage column and realise it adds up to 220. Perhaps my understanding of percentages is old fashioned but I thought percentages added up to 100?

  6. #6
    Join Date
    Sep 2009
    Posts
    4
    haha. good point. the WHOLE col should be 200 (change the last 20s to 10s). What I am trying to do is get % for the city CAM and % for the city BOS. I have over 60 locations, so my output (if I can get it to work) will be 6,000+ for the whole col, but 100% per each location.

    Hope that clears up what I mean.

  7. #7
    Join Date
    Sep 2009
    Posts
    64
    If I understand correctly, what you want is percentage within the city.

    So what you need to do is to get the total count per city :
    Code:
    SELECT state, city, COUNT(*) AS total FROM master GROUP BY 1, 2
    Then you get the count for each Q1, which you already have
    Code:
    SELECT q1, state, city, COUNT(*) AS cnt FROM master GROUP BY 1, 2, 3
    Then what you want to do is you want to join the 2 results so that you can get percentages.
    Code:
    SELECT
    	q1_totals.q1, q1_totals.state, q1_totals.city, (q1_totals.cnt / city_totals.total) * 100 AS percent
    FROM
    	(SELECT q1, state, city, COUNT(*) AS cnt FROM master GROUP BY 1, 2, 3) AS q1_totals
    	JOIN (SELECT state, city, COUNT(*) AS total FROM master GROUP BY 1, 2) AS city_totals ON q1_totals.state = city_totals.state
    		AND q1_totals.city = city_totals.city

  8. #8
    Join Date
    Sep 2009
    Posts
    3
    why don't you just find percent by taking the # of results you want and dividing that # by the total # of items in the table and multiplying it by 100.

    that will give you the right %.

  9. #9
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Quote Originally Posted by petschek
    What I am trying to do is get % for the city CAM and % for the city BOS
    This is probably the same as mnirwan's but I haven't checked.
    Code:
    SELECT q1, state, city, 
            ( count(*)*100 ) / 
                 ( select count(*) from master m2
                   where  m2.state = m1.state 
                          and m2.city = m1.city ) 
            as percent 
    FROM master m1
    group by m1.state, m1.city, m1.q1
    It seems silly to me to call the final column percent when anyone who looks at it is going to think "that's odd - it's more than 100". I'd suggest separate reports for each city or just renaming the column.

    Mike

  10. #10
    Join Date
    Sep 2009
    Posts
    64
    Hey Mike ...

    Just wondering ... performance wise, my solution would have 3 selects total (I think). While yours, MySQL would be forced to run the sub query per row found ... Logically, I think that's what'll happen ... not sure if it's true.

    It's probably similar to my blog post about a particular optimizing query.

  11. #11
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    try this --
    Code:
    SELECT t2.q1
         , t2.state
         , t2.city
         , 100.0 * t2.c / t1.c AS pct
      FROM ( SELECT state         
                  , city         
                  , COUNT(*) AS c
               FROM master
             GROUP
                 BY state         
                  , city ) AS t1
    INNER
      JOIN ( SELECT q1         
                  , state         
                  , city         
                  , COUNT(*) AS c
               FROM master
             GROUP
                 BY q1         
                  , state         
                  , city ) AS t2
        ON t2.state = t1.state
       AND t2.city  = t1.city
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  12. #12
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Quote Originally Posted by mnirwan
    Hey Mike ...

    Just wondering ... performance wise, my solution would have 3 selects total (I think). While yours, MySQL would be forced to run the sub query per row found ... Logically, I think that's what'll happen ... not sure if it's true.

    It's probably similar to my blog post about a particular optimizing query.
    I'm certain you're query is faster than mine and I've no doubt Rudy's is faster still but I'm guessing that all the queries will run in under a second on the data involved so I figured it wasn't worth the effort of rewriting the SQL unless I knew there were performance issues. It would be interesting though to get some times for the different queries though.

  13. #13
    Join Date
    Sep 2009
    Posts
    64
    Quote Originally Posted by mike_bike_kite
    I'm certain you're query is faster than mine and I've no doubt Rudy's is faster still but I'm guessing that all the queries will run in under a second on the data involved so I figured it wasn't worth the effort of rewriting the SQL unless I knew there were performance issues. It would be interesting though to get some times for the different queries though.
    Other than using INNER JOIN, I think Rudy's is exactly the same as mine ???

    And off course, it's all depends on the data. If you have only a handful, then probably not worth spending energy to optimize it. However, if you have huge data, I've seen queries time cut from 10 minutes to 24 seconds simply by having MySQL not run sub query on each rows.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •