If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > MySQL > Count to % question

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 09-23-09, 11:58
petschek petschek is offline
Registered User
 
Join Date: Sep 2009
Posts: 4
Count to % question

I'm working with survey data. Essentially, I want to count total responses for a question for a location, then use that total to create what percentage of all responses each particular response is. I then want to group results by location.

An ideal output would be similar to this:

Q1 | State | City | %
yes| MA | bos |10
no | MA | bos |40
m. | MA | bos |50
yes| MA | cam |20
no | MA | cam |20
m. | MA | cam |80

The problem I run into (I believe) is that GROUP BY works before my count statement, so I can't count all the responses. Below is an example of what I have to produce real numbers:

SELECT q1, state, city, COUNT(q1) FROM master GROUP BY state, city, q1


not all questions have responses, so below is my attempt to get %:

SELECT q1, state, city, count(q1)/(count(nullif(q1,0))) as percent FROM master group by state, city, q1


I believe using WITH or OVER(PARTITION BY...) would be a possible avenue, but I can't get either to work. Any help or direction would be much appreciated.
Reply With Quote
  #2 (permalink)  
Old 09-23-09, 12:13
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
Quote:
Originally Posted by petschek
Any help or direction would be much appreciated.
Something like this should work:
Code:
SELECT q1, state, city, 
        ( count(*)*100 ) / ( select count(*) from master ) as percent 
FROM master 
group by state, city, q1
PS master isn't a very good table name and q1 implies a question rather than an answer to a question.

Mike
Reply With Quote
  #3 (permalink)  
Old 09-23-09, 14:04
petschek petschek is offline
Registered User
 
Join Date: Sep 2009
Posts: 4
Thanks Mike. However, that doesn't give me % by location. That gives me a % against the whole. My understanding is that GROUP BY is done first, so if I have that, perhaps somehow one could 'count (all groups)', or something to that effect?
Reply With Quote
  #4 (permalink)  
Old 09-23-09, 14:05
petschek petschek is offline
Registered User
 
Join Date: Sep 2009
Posts: 4
...i also simplified the names of fields/tables, but point well taken.

nick
Reply With Quote
  #5 (permalink)  
Old 09-23-09, 14:27
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
Quote:
Originally Posted by petschek
Thanks Mike. However, that doesn't give me % by location. That gives me a % against the whole. My understanding is that GROUP BY is done first, so if I have that, perhaps somehow one could 'count (all groups)', or something to that effect?
It should do what I thought you wanted (if that makes sense). I've now looked more closely at your percentage column and realise it adds up to 220. Perhaps my understanding of percentages is old fashioned but I thought percentages added up to 100?
Reply With Quote
  #6 (permalink)  
Old 09-23-09, 22:29
petschek petschek is offline
Registered User
 
Join Date: Sep 2009
Posts: 4
haha. good point. the WHOLE col should be 200 (change the last 20s to 10s). What I am trying to do is get % for the city CAM and % for the city BOS. I have over 60 locations, so my output (if I can get it to work) will be 6,000+ for the whole col, but 100% per each location.

Hope that clears up what I mean.
Reply With Quote
  #7 (permalink)  
Old 09-24-09, 03:23
mnirwan mnirwan is offline
Registered User
 
Join Date: Sep 2009
Posts: 64
If I understand correctly, what you want is percentage within the city.

So what you need to do is to get the total count per city :
Code:
SELECT state, city, COUNT(*) AS total FROM master GROUP BY 1, 2
Then you get the count for each Q1, which you already have
Code:
SELECT q1, state, city, COUNT(*) AS cnt FROM master GROUP BY 1, 2, 3
Then what you want to do is you want to join the 2 results so that you can get percentages.
Code:
SELECT
	q1_totals.q1, q1_totals.state, q1_totals.city, (q1_totals.cnt / city_totals.total) * 100 AS percent
FROM
	(SELECT q1, state, city, COUNT(*) AS cnt FROM master GROUP BY 1, 2, 3) AS q1_totals
	JOIN (SELECT state, city, COUNT(*) AS total FROM master GROUP BY 1, 2) AS city_totals ON q1_totals.state = city_totals.state
		AND q1_totals.city = city_totals.city
Reply With Quote
  #8 (permalink)  
Old 09-24-09, 03:34
froobop froobop is offline
Registered User
 
Join Date: Sep 2009
Posts: 3
why don't you just find percent by taking the # of results you want and dividing that # by the total # of items in the table and multiplying it by 100.

that will give you the right %.
Reply With Quote
  #9 (permalink)  
Old 09-24-09, 03:53
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
Quote:
Originally Posted by petschek
What I am trying to do is get % for the city CAM and % for the city BOS
This is probably the same as mnirwan's but I haven't checked.
Code:
SELECT q1, state, city, 
        ( count(*)*100 ) / 
             ( select count(*) from master m2
               where  m2.state = m1.state 
                      and m2.city = m1.city ) 
        as percent 
FROM master m1
group by m1.state, m1.city, m1.q1
It seems silly to me to call the final column percent when anyone who looks at it is going to think "that's odd - it's more than 100". I'd suggest separate reports for each city or just renaming the column.

Mike
Reply With Quote
  #10 (permalink)  
Old 09-24-09, 17:54
mnirwan mnirwan is offline
Registered User
 
Join Date: Sep 2009
Posts: 64
Hey Mike ...

Just wondering ... performance wise, my solution would have 3 selects total (I think). While yours, MySQL would be forced to run the sub query per row found ... Logically, I think that's what'll happen ... not sure if it's true.

It's probably similar to my blog post about a particular optimizing query.
Reply With Quote
  #11 (permalink)  
Old 09-24-09, 18:50
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
try this --
Code:
SELECT t2.q1
     , t2.state
     , t2.city
     , 100.0 * t2.c / t1.c AS pct
  FROM ( SELECT state         
              , city         
              , COUNT(*) AS c
           FROM master
         GROUP
             BY state         
              , city ) AS t1
INNER
  JOIN ( SELECT q1         
              , state         
              , city         
              , COUNT(*) AS c
           FROM master
         GROUP
             BY q1         
              , state         
              , city ) AS t2
    ON t2.state = t1.state
   AND t2.city  = t1.city
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #12 (permalink)  
Old 09-24-09, 19:10
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
Quote:
Originally Posted by mnirwan
Hey Mike ...

Just wondering ... performance wise, my solution would have 3 selects total (I think). While yours, MySQL would be forced to run the sub query per row found ... Logically, I think that's what'll happen ... not sure if it's true.

It's probably similar to my blog post about a particular optimizing query.
I'm certain you're query is faster than mine and I've no doubt Rudy's is faster still but I'm guessing that all the queries will run in under a second on the data involved so I figured it wasn't worth the effort of rewriting the SQL unless I knew there were performance issues. It would be interesting though to get some times for the different queries though.
Reply With Quote
  #13 (permalink)  
Old 09-25-09, 04:50
mnirwan mnirwan is offline
Registered User
 
Join Date: Sep 2009
Posts: 64
Quote:
Originally Posted by mike_bike_kite
I'm certain you're query is faster than mine and I've no doubt Rudy's is faster still but I'm guessing that all the queries will run in under a second on the data involved so I figured it wasn't worth the effort of rewriting the SQL unless I knew there were performance issues. It would be interesting though to get some times for the different queries though.
Other than using INNER JOIN, I think Rudy's is exactly the same as mine ???

And off course, it's all depends on the data. If you have only a handful, then probably not worth spending energy to optimize it. However, if you have huge data, I've seen queries time cut from 10 minutes to 24 seconds simply by having MySQL not run sub query on each rows.
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On