# Thread: calculate query and subquery

1. Registered User
Join Date
Mar 2011
Location
Sydney, Australia
Posts
58

## Unanswered: calculate query and subquery

I have three tables, horse, entry and prize.
I need to find the horses which have won above average total winnings.
My query to find the average is:

select h.horse_id, h.horse_name, sum(p.money) as career_winnings
from horse H join entry e
on h.horse_id = e.horse_id
join prize p on e.place = p.place
where h.horse_name not in ('unknown dam' , 'unknown sire')
group by h.horse_id;

This works.
I now need to create the outer query but don't know how to start, particularly as the average winnings needs to include the horses who didn't win anything. My tables:

Prize
Event_id Place Money
101 1 120
101 2 60
101 3 30
102 1 10
102 2 5
102 3 2
103 1 100
103 2 60
103 3 40
401 1 1000

Horse
Horse_id Horse_Name
101 Flash
102 Star
201 Boxer
301 Daisy
401 Snowy
501 Bluebell
502 Sally
9998 Unknown dam
9999 Unknown sire

Entry
Event_id Horse_id Place
101 101 1
101 102 2
101 201 3
101 301 4
102 201 2
103 201 3
201 101 1
301 301 2
401 102 7

2. Registered User
Join Date
Sep 2009
Location
San Sebastian, Spain
Posts
880
Firstly, instead of using the SUM, use AVG as this returns the average. Also have a look at this join "join prize p on e.place = p.place". Are you sure that this is what you should be joining on?

Also when using the AVG function if a NULL is passed in it does not include this as part of the calculation. You may need to look at using an IF statement to convert the NULL to a 0.

3. Registered User
Join Date
Mar 2011
Location
Sydney, Australia
Posts
58
Thanks, Ronan, for your comment about the join; it took me a little while to work it out but I can see your point: if I just join on place, that doesn't work because there are multiples of each place, so I need to qualify that link. I need to join it on place and event_id, don't I? so this is what I've changed it to the version below, which gives the correct totals.

select h.horse_id, h.horse_name,
SUM(p.money) as career_winnings
from horse H join entry e
on h.horse_id = e.horse_id
join prize p on e.place = p.place
and e.event_id = p.event_id
where h.horse_name not in ('unknown dam' , 'unknown sire')
group by h.horse_id;

My reason for using SUM was that in this sub-query, I need to first come up with each horse's total career winnings. When I have the total career winnings for each horse, I assume I will need an outer query to get the average winings from this result list of winnings, including the non-winners in the calculation, (which will be around 255.00/7 = 36.43) and then find which horses achieved above that figure (which happens to be all of them).

Do you still think I should use AVG() here?
I'm still not sure how to go about structuring the outer query: should the IF clause go in the outer query?

4. Registered User
Join Date
Dec 2007
Location
Richmond, VA
Posts
1,328
Here is something that should get you on track. Completely untested.
Code:
```select x.horse_id
,  x.horse_name
,  x.career_winnings
from (select money.total_money/hors.total_horse as avg_money
from (select count(*) as total_horse from horse) as hors
,(select sum(money) as total_money from prize) as money) as sum_avg
,(select h.horse_id
, h.horse_name
, SUM(p.money) as career_winnings
from horse H
join entry e
on h.horse_id = e.horse_id
join prize p
on e.place    = p.place
and e.event_id = p.event_id
where h.horse_name not in ('unknown dam' , 'unknown sire')
group by h.horse_id) as x
where x.career_winnings > sum_avg.avg_money```
Dave Nance

5. Registered User
Join Date
Mar 2011
Location
Sydney, Australia
Posts
58
Thank you, Dave.
This looks close but returns an empty set.
I'm thinking that it is saying that there is no horse which has earned above the average career winnings.
As this can't be true, is the error caused by the statement which calls for the average calculation be based on total money in the prize table?

(SELECT SUM(money) AS total_money FROM prize) AS money) AS sum_avg

The total of the prize table involves many more events and prizes than those in the entry table. As it's only the horses in the entry table who won money, to give the correct answer, shouldn't this statement be based on the total of the career_winnings column in my subquery and divided by the total number of horses?
There are 3 horses with career_winnings:
101, 102 and 201, with winnings totalling 255.00
How can I adjust the statement to use the total of the career_winnings subquery, rather than the total of the prize table?

6. Registered User
Join Date
Dec 2007
Location
Richmond, VA
Posts
1,328
In that case, you would need to add the horse table into that subselect and nest it a little further to get the avg winnings.
Something like(again, just off top of my head you):

7. Registered User
Join Date
Mar 2011
Location
Sydney, Australia
Posts
58
Thanks, Davo, but the rest of your message didn't come through. Could you try again, please?

8. Registered User
Join Date
Mar 2011
Location
Sydney, Australia
Posts
58