If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > Microsoft SQL Server > Join performances, when is each join best.

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 02-13-11, 14:24
Ingovals Ingovals is offline
Registered User
 
Join Date: Feb 2011
Posts: 1
Join performances, when is each join best.

Before I start I wan't it to be known that this is a homework assignment and I'm not expecting anyone to do it for me. However I've had some problem understanding things. I do not like the book I have and don't find that I'm understanding things correctly from it.

The project revolves around finding 3 queries that each are really good with a particular join method. Well not necessarily good, just much better then the others. The join methods are Merged, Hash and Index Nested Loop.

Finding some queries isn't that big of an issue, it's mostly understanding why it's so much quicker and as I only understand the basics it can be really hard to get it.

I have 2 databases with 3 candidate keys each (unique1, unique2 and unique3) each with 1.000.000 records. They have clustered index on unique2 and unclustered on unique1.

Now the first problem I encountered was with Index Nested Loop. I assumed you could get good results if the inner relation was a clustered index. So I have this query:

Code:
Select A.UNIQUE2 from [Project3].[dbo].Wisc1000Ka A 
join [Project3].[dbo].Wisc1000Kb B on A.UNIQUE2 = B.UNIQUE2
option( loop join )
Here both the inner and outer relation is on a clustered index. However it isn't really fast. Compared to:

Code:
Select A.UNIQUE1 from [Project3].[dbo].Wisc1000Ka A 
join [Project3].[dbo].Wisc1000Kb B on A.UNIQUE1 = B.UNIQUE2
option( loop join )
Which outer relation is non-clustered but still it's much faster.

How can this be. I'm obviously misunderstanding everything here.

I might add more problems if someone is able to help me with this one.
Reply With Quote
  #2 (permalink)  
Old 02-16-11, 08:57
Kaiowas Kaiowas is offline
Registered User
 
Join Date: Feb 2004
Posts: 488
In short, IMHO, a Hash join performs well on large joins, a merged join on a moderate size and a nested loop works well on small sets. I suppose it all comes down on how sqlserver chooses it's join and exactly how that join logic is implemented, also considering indices/data that is available.
Reply With Quote
  #3 (permalink)  
Old 02-17-11, 16:45
pootle flump pootle flump is offline
King of Understatement
 
Join Date: Feb 2004
Location: One Flump in One Place
Posts: 14,905
I'm afraid the information you provide is not very clear.
Quote:
Originally Posted by Ingovals View Post
I have 2 databases with 3 candidate keys each (unique1, unique2 and unique3) each with 1.000.000 records. They have clustered index on unique2 and unclustered on unique1.
Could you supply ALL the DDL for your lab? Also it would be useful to see the code (if there is any) to generate the data.

"Speed" (i.e. elapsed time) is a very poor measure when evaluating queries. Can you post instead the results of
Code:
SET STATISTICS PROFILE, IO ON
for the queries instead?

This guy's blog series on joins is awesome (start at the bottom):
Browse by Tags - Craig Freedman's SQL Server Blog - Site Home - MSDN Blogs
__________________
Testimonial:
Quote:
pootle flump
ur codings are working excelent.
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On