12-16-13, 10:32 #1Registered User
- Join Date
- Dec 2013
Why would SQL cast have to be expensive if used on a join column?
I see that sql join internally functions always on some of the three basic operations : nested loops join, merge join and hash join .
From multiple articles on the internet I see SQL casting is said to be able to impact performances severelly and almost always the reason is it disables index usage.
But to me it looks it could always be avoided:
Casting could disable merge join usage (and merge benefits from index usage) but hash join, that is said to be generally faster than merge, can still be utilized in the same way(looks to me). No matter in which phase of the hash join (build of probe) the hash value of the casted column is calculated. All that has to be done is casting, before hashing, which is done for every single value.
If I am right the full hash join cycle is done for every sql join since permanent hash tables would be useful only if the join is repeatedly done on the same column.
Hence, from my aspect, casting of join columns has not performance impact but the casting time for every join column value, once, for both tables. And the guys on the internet say casting severely impacts performances since indexes cannot be used.
Is something wrong in my thinking above?
Thank you for the time
12-16-13, 11:09 #2Resident Curmudgeon
- Join Date
- Feb 2004
- In front of the computer
When you apply any function to a column, then any index based on that column is no longer eligible for use in a join operation. It wouldn't seem that Cast() ought to matter, but consider the difference between a number stored as a left justified string and one stored as a right justified string... The sort order as REAL would be quite different than the sort order as VARCHAR!
If the Cast appears on the "left" side of the join, then the cost is trivial since the source or probe sides of the join are only considered once per candidate row going into the join. If the Cast() appears on the "right" side of the join, then no index is eligible for this join... If no index had been eligible before, then you must apply the Cast() for every candidate row and that adds some cost to the join. If an index was eligible before the Cast() so what could have been a seek now becomes a scan, then the cost becomes huge.
As with many things in databases and programming in general, there isn't a "one size fits all" answer. This is why consultants and educators (especially Paul Randal) answer most questions with "it depends" instead of a simpler answer!
-PatPIn theory, theory and practice are identical. In practice, theory and practice are unrelated.