Results 1 to 9 of 9
  1. #1
    Join Date
    Nov 2003
    Location
    LA, CA
    Posts
    54

    Angry Unanswered: Anyone know the code for creating Medians?

    Hi there... would anyone know how to calculate Medians in a dataset?


    The median is the middle value of data (NOT THE AVERAGE). For instance, if there were 11 records, the calculation is (11 +1) /2 = 6 The 6th record (assuming they are in ascending order) is the median. For records that come out not as a whole number (e.g., 10 record's median is (10+1)/2 = 5.5) you would have to take the average of the rounded down and rounded up whole numbers. In the example of 10 records, you would take the 5 record + the 6th record/2 = extrapolated median.

    This is currently a manual process when I am analyzing salary data. Would love to automate it. Can't think of a way to do it....

    Gosh my brain hurts.

    Rachel
    Compensation Analyst

  2. #2
    Join Date
    Dec 2005
    Location
    Texas
    Posts
    100
    Are you wanting to do it in a VB application or on a DB?

  3. #3
    Join Date
    Nov 2003
    Location
    LA, CA
    Posts
    54
    Ooops... did I put this in the wrong spot? I saw some other VBA posts in here and thought it might be a good place to start. I am using Access

  4. #4
    Join Date
    Dec 2005
    Location
    Texas
    Posts
    100
    Not the wrong spot at all. I have some code in a book called 'Access Cookbook' by O'Reilly that will calculate a median, though it runs a bit slowly. I'll post it later since it takes ages to type out.

  5. #5
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    I believe this is specifically called the financial median right?

    Check out:
    http://www.oreilly.com/catalog/trans...pter/ch08.html
    do a find on Median (about 2/3 of the way down). The code is T-SQL but can easily be adapted. I posted a median calculation solution a while ago but it struggled if there were nulls in the table.

    If you check the link you will see the logic. This is the T-SQL translated:
    Code:
    SELECT 
       IIF(COUNT(*) Mod 2=1, x.Hours, x.Hours+MIN(Iif(y.Hours>x.Hours, y.Hours)/2.0)) AS median
    FROM BulbLife x, BulbLife y
    GROUP BY x.Hours
    HAVING 
       SUM(Iif(y.Hours <= x.Hours, 1, 0))>=(count(*)+1)/2 AND
       SUM(Iif(y.Hours >= x.Hours, 1, 0))>=(count(*)/2)+1
    I've just checked the soultion I previously posted - it wasn't the financial median

    EDIT - added missing )
    Last edited by pootle flump; 09-21-06 at 07:44.
    Testimonial:
    pootle flump
    ur codings are working excelent.

  6. #6
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Quote Originally Posted by pootle flump
    The code is T-SQL but can easily be adapted.
    Well - you would think so.

    Code:
    CREATE TABLE BulbLife (Hours INTEGER)
     
    INSERT INTO BulbLife (Hours)
    VALUES (1085)
    INSERT INTO BulbLife (Hours)
    VALUES (1109)
    INSERT INTO BulbLife (Hours)
    VALUES (1093)
    INSERT INTO BulbLife (Hours)
    VALUES (1043)
    INSERT INTO BulbLife (Hours)
    VALUES (1129)
    INSERT INTO BulbLife (Hours)
    VALUES (1099)
    INSERT INTO BulbLife (Hours)
    VALUES (1057)
    INSERT INTO BulbLife (Hours)
    VALUES (1114)
    INSERT INTO BulbLife (Hours)
    VALUES (1077)
    INSERT INTO BulbLife (Hours)
    VALUES (1086)
    INSERT INTO BulbLife (Hours)
    VALUES (1103)
    INSERT INTO BulbLife (Hours)
    VALUES (1079)
    INSERT INTO BulbLife (Hours)
    VALUES (1073)
    INSERT INTO BulbLife (Hours)
    VALUES (1086)
    INSERT INTO BulbLife (Hours)
    VALUES (1131)
    INSERT INTO BulbLife (Hours)
    VALUES (1087)
    INSERT INTO BulbLife (Hours)
    VALUES (1096)
    INSERT INTO BulbLife (Hours)
    VALUES (1167)
    INSERT INTO BulbLife (Hours)
    VALUES (1043)
    INSERT INTO BulbLife (Hours)
    VALUES (1074)
    Either I have cocked this up or Access handles cartesian products in a rather cavelier way:
    Code:
    --T-SQL
    SELECT x.Hours,  
     SUM(CASE WHEN y.Hours <= x.Hours THEN 1 ELSE 0 END) AS Suma, 
     (count(*)+1)/2 AS GrtrThan, 
      SUM(CASE WHEN y.Hours >= x.Hours THEN 1 ELSE 0 END) AS Sumb, 
     (count(*)/2)+1 AS GrtrThan2
    FROM  BulbLife x, BulbLife y
    GROUP BY 
     x.Hours
    ORDER BY 
     x.Hours
     
    --Access
    SELECT x.Hours, Sum(IIf([y].[Hours]<=[x].[Hours],1,0)) AS Suma, (Count(*)+1)/2 AS GrtrThan, Sum(IIf([y].[Hours]>=[x].[Hours],1,0)) AS Sumb, (Count(*)/2)+1 AS GrtrThen2
    FROM BulbLife AS x, Bulblife AS y
    GROUP BY x.Hours
    ORDER BY x.Hours;
    Anyone fancy pointing out what I missed?
    Testimonial:
    pootle flump
    ur codings are working excelent.

  7. #7
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Oops - wasn't testing on the same data And the T-SQL took advantage of Integer\ Integer = Integer so had to use integer division. Defo works:
    Code:
    SELECT IIf(Count(*) Mod 2=1,[x].[hours],([x].[hours]+Min(IIf([y].[hours]>[x].[hours],[y].[hours])))/2) AS Median
    FROM BulbLife AS x, Bulblife AS y
    GROUP BY x.Hours
    HAVING Sum(IIf([y].[Hours]<=[x].[Hours],1,0))>=(Count(*)+1)\2 AND Sum(IIf(y.Hours>=x.Hours,1,0))>=(Count(*)\2)+1
    Testimonial:
    pootle flump
    ur codings are working excelent.

  8. #8
    Join Date
    Nov 2003
    Location
    LA, CA
    Posts
    54
    oh dear.... now i am very confused... which code works? I want it to run off a query that doesn't have null values for people's salary.... Argh... you folks are waaaaaay more advanced than i am!!! (and thank god for that!)

    Rachel

  9. #9
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    The last one I posted works If you have no nulls then it should be fine.
    Testimonial:
    pootle flump
    ur codings are working excelent.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •