Not a question, just a post about something I ran into today that surprised me a little.
I have a huge dataset - about 700m rows, one column of which is a tinyint. I needed to take an average of that column, so I did this:
select avg(mytinyint) from mytable
Msg 8115, Level 16, State 2, Line 6
Arithmetic overflow error converting expression to data type int.
It turns out, if the sum() over the column is bigger than MAX_INT = 2147483647, then avg() will throw this error. This surprised me a little - I guess I was assuming sql server would use bigint internally when computing integer avg, but it uses int.
so you have to do this:
select avg(cast(mytinyint as bigint)) from mytable
what's annoying to me is, it's impossible that an average of type T could be larger or smaller than the max/min allowed value of a type T. This means sql server devs could have internally always used bigint for sum and count, and then returned cast(sum/count as T) from avg() without possibility of overflow (unless sum or count was bigger than the max bigint! you are on your own if that happens i guess!).
Because of this, you have to cast your column to a bigint to make it work if your dataset is large enough, even if the column is only 1 measly byte! kind of a waste of space if you ask me.
here's a little example of what I'm talking about, that doesn't require you to import a 700m row dataset
declare @t table (id int)
insert into @t select 1 union all select 2147483647
select avg(cast(id as bigint)) from @t -- works
select avg(id) from @t -- fails