Results 1 to 6 of 6

021912, 10:30 #1Registered User
 Join Date
 Feb 2012
 Posts
 6
Unanswered: SQL server, alculating median values on a column in a table
Hi there,
I need to calculate a median on a column in a table.
The code I have is:
Code:Select gender, CASE when gender = 'F' then 'Female' when gender = 'M' then 'Male' else 'Unknown' end as test, datediff(day, [admit_date], getdate()) as 'datediffcal', from [tbl_record]
It doesn't matter if the resultset only shows the median result. So if the output shows:
median
15
that's fine. Minimally, I need the median value.
Is anybody able to help me on this?
Thank you

021912, 15:16 #2Registered User
 Join Date
 Nov 2004
 Posts
 1,428
Provided Answers: 4Try this:
Code:SELECT R.gender, AVG(1.0 * datediff(day, R.admit_date, getdate())) AS medianValue FROM (SELECT gender, admit_date, ROW_NUMBER() OVER(PARTITION BY gender ORDER BY admit_date) AS rowno FROM #DaTable ) AS R INNER JOIN (SELECT gender, 1 + count(*) as N FROM #DaTable GROUP BY gender ) G ON R.gender = G.gender AND R.rowNo BETWEEN N/2 AND N/2+N%2 GROUP BY R.gender
With kind regards . . . . . SQL Server 2000/2005/2012
Wim
Grabel's Law: 2 is not equal to 3  not even for very large values of 2.
Pat Phelan's Law: 2 very definitely CAN equal 3  in at least two programming languages

021912, 15:58 #3Registered User
 Join Date
 Feb 2012
 Posts
 6
regarding the median
Hi,
Thank you.
Because I am relatively new to this, would you mind explaining to me what the
Code:ROW_NUMBER() OVER(PARTITION BY gender ORDER BY admit_date)
How does the average() function come into play here?
Can you explain why we need an inner join here?
And what doesCode:N/2 AND N/2+N%2
Sorry. I'm just really learning the basics of the basics....appreciate your time with me on this matter.

021912, 18:13 #4Registered User
 Join Date
 Nov 2004
 Posts
 1,428
Provided Answers: 4How does the average() function come into play here?Median is a statistical calculation that's commonly used for analytical purposes to determine the middle value(s) in a distribution. Median is calculated differently depending on whether the input set of values contains an odd or even number of elements. When you have an odd number of elements and you sort the values in chronological order, median is the middle value. For example, if you have this set of values {30, 10, 40} and you sort them in chronological order, the median is 30 (middle value of {10, 30, 40}). In an even number of elements (also sorted in chronological order), the median is the average of the two middle values. For example, given the set of values {30, 10, 40, 10}, the median is 20 (average of 10 and 30).And what does
N/2 AND N/2+N%2
really do? Is that % sign in there? Is that mod?
Let's see what each part of the query does:Code:(SELECT gender, admit_date, ROW_NUMBER() OVER(PARTITION BY gender ORDER BY admit_date) AS rowno FROM #DaTable ) AS R
ROW_NUMBER() OVER(PARTITION BY gender ORDER BY admit_date)
does: partition, divide, group, ... all the records by gender, and per gender value, order all the records by the admit_date and assign an incremental number to them, starting from 1.
Code:(SELECT gender, 1 + count(*) as N FROM #DaTable GROUP BY gender ) G
 Add 1 to N, so when we have an odd number of records, we end up with an even result and
R.rowNo BETWEEN N/2 AND N/2+N%2
for an even number of N, like 10, will give us
R.rowNo BETWEEN 10/2 AND 10/2+0
=
R.rowNo BETWEEN 5 AND 5
returning exactly one record, the middle one.
 Add 1 to N, so when we have an even number of records, we end up with an odd result and
R.rowNo BETWEEN N/2 AND N/2+N%2
for an odd number of N, like 11, will give us
R.rowNo BETWEEN 11/2 AND 11/2 + 1
=
R.rowNo BETWEEN 5 AND 6
returning the two middle records.
Code:SELECT R.gender, AVG(1.0 * datediff(day, R.admit_date, getdate())) AS medianValue ... INNER JOIN (... ) G ON R.gender = G.gender AND R.rowNo BETWEEN N/2 AND N/2+N%2
The median is the average value of the middle record(s).
 With one middle record, the average = the value of the middle record itself.
 With two middle records, the average of those two records.
I don't know if my explanation helped in any way or made it more confusing. Basically it just does what the definition of Median states.With kind regards . . . . . SQL Server 2000/2005/2012
Wim
Grabel's Law: 2 is not equal to 3  not even for very large values of 2.
Pat Phelan's Law: 2 very definitely CAN equal 3  in at least two programming languages

021912, 18:57 #5Resident Curmudgeon
 Join Date
 Feb 2004
 Location
 In front of the computer
 Posts
 15,579
Provided Answers: 54Purely as a side note, the Median function is provided in SQL Server 2012.
PatPIn theory, theory and practice are identical. In practice, theory and practice are unrelated.

022012, 21:40 #6Registered User
 Join Date
 Feb 2012
 Posts
 6
regarding the median
Thank you for your detailed explanation Wim..I really appreciate your answer...so detailed.
Could you explain why we need to use an inner join to join the two sets?
median is the average value of the middle records. So does the INNER JOIN essentially join the two sets of middle records based on the equal rownum condition?
Will this condition always satisfy? I am trying to picture it in my head so:
Code:R.gender = G.gender