Results 1 to 3 of 3
  1. #1
    Join Date
    Feb 2011
    Posts
    1

    Unanswered: Join performances, when is each join best.

    Before I start I wan't it to be known that this is a homework assignment and I'm not expecting anyone to do it for me. However I've had some problem understanding things. I do not like the book I have and don't find that I'm understanding things correctly from it.

    The project revolves around finding 3 queries that each are really good with a particular join method. Well not necessarily good, just much better then the others. The join methods are Merged, Hash and Index Nested Loop.

    Finding some queries isn't that big of an issue, it's mostly understanding why it's so much quicker and as I only understand the basics it can be really hard to get it.

    I have 2 databases with 3 candidate keys each (unique1, unique2 and unique3) each with 1.000.000 records. They have clustered index on unique2 and unclustered on unique1.

    Now the first problem I encountered was with Index Nested Loop. I assumed you could get good results if the inner relation was a clustered index. So I have this query:

    Code:
    Select A.UNIQUE2 from [Project3].[dbo].Wisc1000Ka A 
    join [Project3].[dbo].Wisc1000Kb B on A.UNIQUE2 = B.UNIQUE2
    option( loop join )
    Here both the inner and outer relation is on a clustered index. However it isn't really fast. Compared to:

    Code:
    Select A.UNIQUE1 from [Project3].[dbo].Wisc1000Ka A 
    join [Project3].[dbo].Wisc1000Kb B on A.UNIQUE1 = B.UNIQUE2
    option( loop join )
    Which outer relation is non-clustered but still it's much faster.

    How can this be. I'm obviously misunderstanding everything here.

    I might add more problems if someone is able to help me with this one.

  2. #2
    Join Date
    Feb 2004
    Posts
    492
    In short, IMHO, a Hash join performs well on large joins, a merged join on a moderate size and a nested loop works well on small sets. I suppose it all comes down on how sqlserver chooses it's join and exactly how that join logic is implemented, also considering indices/data that is available.

  3. #3
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    I'm afraid the information you provide is not very clear.
    Quote Originally Posted by Ingovals View Post
    I have 2 databases with 3 candidate keys each (unique1, unique2 and unique3) each with 1.000.000 records. They have clustered index on unique2 and unclustered on unique1.
    Could you supply ALL the DDL for your lab? Also it would be useful to see the code (if there is any) to generate the data.

    "Speed" (i.e. elapsed time) is a very poor measure when evaluating queries. Can you post instead the results of
    Code:
    SET STATISTICS PROFILE, IO ON
    for the queries instead?

    This guy's blog series on joins is awesome (start at the bottom):
    Browse by Tags - Craig Freedman's SQL Server Blog - Site Home - MSDN Blogs
    Testimonial:
    pootle flump
    ur codings are working excelent.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •