If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > Microsoft SQL Server > beware of avg() for large datasets

Reply
 
LinkBack Thread Tools Display Modes
  #1 (permalink)  
Old 10-31-06, 18:50
jezemine jezemine is offline
another indirection layer
 
Join Date: May 2004
Location: Seattle
Posts: 1,271
beware of avg() for large datasets

Not a question, just a post about something I ran into today that surprised me a little.

I have a huge dataset - about 700m rows, one column of which is a tinyint. I needed to take an average of that column, so I did this:

Code:
select avg(mytinyint) from mytable
which returned:

Code:
Msg 8115, Level 16, State 2, Line 6
Arithmetic overflow error converting expression to data type int.
It turns out, if the sum() over the column is bigger than MAX_INT = 2147483647, then avg() will throw this error. This surprised me a little - I guess I was assuming sql server would use bigint internally when computing integer avg, but it uses int.

so you have to do this:

Code:
select avg(cast(mytinyint as bigint)) from mytable
what's annoying to me is, it's impossible that an average of type T could be larger or smaller than the max/min allowed value of a type T. This means sql server devs could have internally always used bigint for sum and count, and then returned cast(sum/count as T) from avg() without possibility of overflow (unless sum or count was bigger than the max bigint! you are on your own if that happens i guess!).

Because of this, you have to cast your column to a bigint to make it work if your dataset is large enough, even if the column is only 1 measly byte! kind of a waste of space if you ask me.

here's a little example of what I'm talking about, that doesn't require you to import a 700m row dataset

Code:
declare @t table (id int)
insert into @t select 1 union all select 2147483647
select avg(cast(id as bigint)) from @t  -- works
select avg(id) from @t -- fails
__________________
elsasoft.org
Reply With Quote
  #2 (permalink)  
Old 10-31-06, 19:01
rudra rudra is offline
L O S T in Reality
 
Join Date: Nov 2005
Location: Bangalore,India
Posts: 505
hmm....
__________________
Success is the ability to go from one failure to another with no loss of enthusiasm.
- Sir Winston Churchill
Joydeep
Reply With Quote
  #3 (permalink)  
Old 10-31-06, 22:49
blindman blindman is offline
World Class Flame Warrior
 
Join Date: Jun 2003
Location: Ohio
Posts: 10,830
Very interesting. I will check it out tomorrow. Thanks for posting it.
__________________
If it's not practically useful, then it's practically useless.

blindman
www.chess.com: "sqlblindman"
Reply With Quote
Reply

Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On