# Thread: Count to % question

1. Registered User
Join Date
Sep 2009
Posts
4

## Unanswered: Count to % question

I'm working with survey data. Essentially, I want to count total responses for a question for a location, then use that total to create what percentage of all responses each particular response is. I then want to group results by location.

An ideal output would be similar to this:

Q1 | State | City | %
yes| MA | bos |10
no | MA | bos |40
m. | MA | bos |50
yes| MA | cam |20
no | MA | cam |20
m. | MA | cam |80

The problem I run into (I believe) is that GROUP BY works before my count statement, so I can't count all the responses. Below is an example of what I have to produce real numbers:

SELECT q1, state, city, COUNT(q1) FROM master GROUP BY state, city, q1

not all questions have responses, so below is my attempt to get %:

SELECT q1, state, city, count(q1)/(count(nullif(q1,0))) as percent FROM master group by state, city, q1

I believe using WITH or OVER(PARTITION BY...) would be a possible avenue, but I can't get either to work. Any help or direction would be much appreciated.

2. vaguely human
Join Date
Jun 2007
Location
London
Posts
2,527
Originally Posted by petschek
Any help or direction would be much appreciated.
Something like this should work:
Code:
```SELECT q1, state, city,
( count(*)*100 ) / ( select count(*) from master ) as percent
FROM master
group by state, city, q1```
PS master isn't a very good table name and q1 implies a question rather than an answer to a question.

Mike

3. Registered User
Join Date
Sep 2009
Posts
4
Thanks Mike. However, that doesn't give me % by location. That gives me a % against the whole. My understanding is that GROUP BY is done first, so if I have that, perhaps somehow one could 'count (all groups)', or something to that effect?

4. Registered User
Join Date
Sep 2009
Posts
4
...i also simplified the names of fields/tables, but point well taken.

nick

5. vaguely human
Join Date
Jun 2007
Location
London
Posts
2,527
Originally Posted by petschek
Thanks Mike. However, that doesn't give me % by location. That gives me a % against the whole. My understanding is that GROUP BY is done first, so if I have that, perhaps somehow one could 'count (all groups)', or something to that effect?
It should do what I thought you wanted (if that makes sense). I've now looked more closely at your percentage column and realise it adds up to 220. Perhaps my understanding of percentages is old fashioned but I thought percentages added up to 100?

6. Registered User
Join Date
Sep 2009
Posts
4
haha. good point. the WHOLE col should be 200 (change the last 20s to 10s). What I am trying to do is get % for the city CAM and % for the city BOS. I have over 60 locations, so my output (if I can get it to work) will be 6,000+ for the whole col, but 100% per each location.

Hope that clears up what I mean.

7. Registered User
Join Date
Sep 2009
Posts
64
If I understand correctly, what you want is percentage within the city.

So what you need to do is to get the total count per city :
Code:
`SELECT state, city, COUNT(*) AS total FROM master GROUP BY 1, 2`
Then you get the count for each Q1, which you already have
Code:
`SELECT q1, state, city, COUNT(*) AS cnt FROM master GROUP BY 1, 2, 3`
Then what you want to do is you want to join the 2 results so that you can get percentages.
Code:
```SELECT
q1_totals.q1, q1_totals.state, q1_totals.city, (q1_totals.cnt / city_totals.total) * 100 AS percent
FROM
(SELECT q1, state, city, COUNT(*) AS cnt FROM master GROUP BY 1, 2, 3) AS q1_totals
JOIN (SELECT state, city, COUNT(*) AS total FROM master GROUP BY 1, 2) AS city_totals ON q1_totals.state = city_totals.state
AND q1_totals.city = city_totals.city```

8. Registered User
Join Date
Sep 2009
Posts
3
why don't you just find percent by taking the # of results you want and dividing that # by the total # of items in the table and multiplying it by 100.

that will give you the right %.

9. vaguely human
Join Date
Jun 2007
Location
London
Posts
2,527
Originally Posted by petschek
What I am trying to do is get % for the city CAM and % for the city BOS
This is probably the same as mnirwan's but I haven't checked.
Code:
```SELECT q1, state, city,
( count(*)*100 ) /
( select count(*) from master m2
where  m2.state = m1.state
and m2.city = m1.city )
as percent
FROM master m1
group by m1.state, m1.city, m1.q1```
It seems silly to me to call the final column percent when anyone who looks at it is going to think "that's odd - it's more than 100". I'd suggest separate reports for each city or just renaming the column.

Mike

10. Registered User
Join Date
Sep 2009
Posts
64
Hey Mike ...

Just wondering ... performance wise, my solution would have 3 selects total (I think). While yours, MySQL would be forced to run the sub query per row found ... Logically, I think that's what'll happen ... not sure if it's true.

It's probably similar to my blog post about a particular optimizing query.

11. SQL Consultant
Join Date
Apr 2002
Location
Posts
20,002
try this --
Code:
```SELECT t2.q1
, t2.state
, t2.city
, 100.0 * t2.c / t1.c AS pct
FROM ( SELECT state
, city
, COUNT(*) AS c
FROM master
GROUP
BY state
, city ) AS t1
INNER
JOIN ( SELECT q1
, state
, city
, COUNT(*) AS c
FROM master
GROUP
BY q1
, state
, city ) AS t2
ON t2.state = t1.state
AND t2.city  = t1.city```

12. vaguely human
Join Date
Jun 2007
Location
London
Posts
2,527
Originally Posted by mnirwan
Hey Mike ...

Just wondering ... performance wise, my solution would have 3 selects total (I think). While yours, MySQL would be forced to run the sub query per row found ... Logically, I think that's what'll happen ... not sure if it's true.

It's probably similar to my blog post about a particular optimizing query.
I'm certain you're query is faster than mine and I've no doubt Rudy's is faster still but I'm guessing that all the queries will run in under a second on the data involved so I figured it wasn't worth the effort of rewriting the SQL unless I knew there were performance issues. It would be interesting though to get some times for the different queries though.

13. Registered User
Join Date
Sep 2009
Posts
64
Originally Posted by mike_bike_kite
I'm certain you're query is faster than mine and I've no doubt Rudy's is faster still but I'm guessing that all the queries will run in under a second on the data involved so I figured it wasn't worth the effort of rewriting the SQL unless I knew there were performance issues. It would be interesting though to get some times for the different queries though.
Other than using INNER JOIN, I think Rudy's is exactly the same as mine ???

And off course, it's all depends on the data. If you have only a handful, then probably not worth spending energy to optimize it. However, if you have huge data, I've seen queries time cut from 10 minutes to 24 seconds simply by having MySQL not run sub query on each rows.

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•