# Thread: How to find the median?

1. Registered User
Join Date
Dec 2005
Location
Texas
Posts
100

## Unanswered: How to find the median?

I've noticed that SQL Server (and other DBMSs I've looked at) doesn't seem to have a built-in function for finding the median of a range of numbers.

Gack!

2. SQL Consultant
Join Date
Apr 2002
Location
Posts
20,002
gack is right

see this thread from way back in the archives

3. King of Understatement
Join Date
Feb 2004
Location
One Flump in One Place
Posts
14,912
Your required solution may not be quite as complicated as the one linked to. That particular poster needed the median of some not-yet-calculated-values so these values are calculated in derived tables.

If you simply have a table with several values that you want to find the median of then the SQL is simpler.

4. Resident Curmudgeon
Join Date
Feb 2004
Location
In front of the computer
Posts
15,579
Median does not have a set based solution. You have to iterate in order to find the median (order matters to compute a median). Engines that process a row at a time do medians easily, engines that process sets have a real problem with it.

-PatP

5. SQL Consultant
Join Date
Apr 2002
Location
Posts
20,002
okay, which is it? "does not have a set based solution" or "have a real problem with it"?

Code:
```SELECT
CASE WHEN COUNT(*)%2=1
THEN x.Hours
ELSE (x.Hours+MIN(CASE WHEN y.Hours>x.Hours
THEN y.Hours
END))/2.0
END median
FROM BulbLife x, BulbLife y
GROUP BY x.Hours
HAVING
SUM(CASE WHEN y.Hours <= x.Hours
THEN 1 ELSE 0 END)>=(count(*)+1)/2 AND
SUM(CASE WHEN y.Hours >= x.Hours
THEN 1 ELSE 0 END)>=(count(*)/2)+1

-- from Transact-SQL Cookbook Chapter 8  Statistics in SQL```

6. King of Understatement
Join Date
Feb 2004
Location
One Flump in One Place
Posts
14,912
Or:

Code:
```SELECT A.TheValue AS TheMedian
FROM (SELECT DISTINCT TheValue FROM MyTable) A, MyTable B
GROUP BY A.TheValue
HAVING sum(SIGN(A.TheValue - B.TheValue )) IN (0, -1)```

7. King of Understatement
Join Date
Feb 2004
Location
One Flump in One Place
Posts
14,912
Beg your pardon - blummin' thing has failed.

8. Resident Curmudgeon
Join Date
Feb 2004
Location
In front of the computer
Posts
15,579
The Median depends on order in every case, and non-deterministic rows for data with an even number of elements. Because Median depends on element order, there can't be any set based solution, because sets have no order.

I never said that you can't solve it with SQL, that's a very different claim. There are several ways to do that. You still can't solve it declaratively, and you can't solve it with sets.

-PatP

9. SQL Consultant
Join Date
Apr 2002
Location
Posts
20,002
sets may have no order, but order can still be imposed with predicates

i ask you, are you familiar with the query to rank result rows? it's a theta self-join, and what it does is count the rows that are greater (or lesser) than, based on the value of a column, then adds 1 to get the rank

same thing with partitioning a single set into two -- those values that are higher, and those values that are lower

and those are sets

10. SQL Consultant
Join Date
Apr 2002
Location
Posts
20,002
Originally Posted by Pat Phelan
there can't be any set based solution
what do you call the solution in post #5, please?

11. Registered User
Join Date
Jan 2003
Location
Massachusetts
Posts
5,862
On the surface, I would call it a performance problem ;-). How many rows in BulbLife? Never knew you could declare a "join" in the having clause. Or does the optimizer see it that way?

12. SQL Consultant
Join Date
Apr 2002
Location
Posts
20,002
Originally Posted by MCrowley
On the surface, I would call it a performance problem ;-).
okay, i see the smiley, so i won't respond the way i would've if it weren't there, which would've been something along the lines of "oh, so for challenging queries, you dump the data and do it in excel, eh?"

btw, there is no join in the HAVING clause (if you are referring to post #5)

13. Resident Curmudgeon
Join Date
Feb 2004
Location
In front of the computer
Posts
15,579
Originally Posted by r937
what do you call the solution in post #5, please?
I'd call that a Transact-SQL solution.

-PatP

14. Window Washer
Join Date
Nov 2002
Location
Jersey
Posts
10,322
And here I thought you came up with this on your own...

Looks like a must have book though...

15. SQL Consultant
Join Date
Apr 2002
Location
Posts
20,002
Originally Posted by Pat Phelan
I'd call that a Transact-SQL solution.
you are a slippery eel today, but squirm all you want, buddy, i think i gotcha

what part of it makes it Transact-SQL? MIN? CASE? COUNT?

looks to me like it's all SQL

or are you referring to the fact that the solution doesn't use FLOOR?

and what about my questions regarding sets?

is that or is that not a set-based solution?

eel, i say

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•