Results 1 to 10 of 10
  1. #1
    Join Date
    Mar 2007
    Posts
    97

    Unanswered: Index / Join / Where clause very slow

    Hello,

    first of all, some facts of the case:


    Code:
     
    Table Master                         Table    Dimension
    ID   Code        Price                ID     Name
    1    A44333      5000                1     "Scanner"
    2    D442        3000                2     "Notebook"
    3    D6644       4000                3     "Banana"
    I join both tables on ID and search one time for ID and another time for Name. Looks like
    Code:
    (a)
    SELECT AVG(Price) From Master JOIN Dimension ON Master.id = Dimension.id
    WHERE master.id=1
    AND Code like 'A44'
    (b)
    SELECT AVG(Price) From Master JOIN Dimension ON Master.id = Dimension.id
    WHERE Name = 'Scanner'
    AND Code like 'A44'
    Why does query (b) take longer than query (a)? Dimension has 12 Rows and
    Master has about 24M Rows.

    For index I did
    Code:
    Create Index IX_Master_ID on Master(ID)
    Create Index IX_Master_Code on Master(Code)
    Create Index IX_Dimension_ID on Dimension(ID)
    Create Index IX_Dimension_Name on Dimension(Name)
    I noticed, that when i leave the Code like 'A44' clause, query (a) and (b) do take same time. I'm really confused. Can someone please help me out?

    Thank you

    Silas

  2. #2
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    like 'A44' should really be = 'A44'.

    Have you looked at the query plans or run it with set statistics io on?

    The efficiency will improve also if the optimiser knows which indexes are unique. I suspect (but don't know) that the below two are unique:
    Code:
    Create Index IX_Dimension_ID on Dimension(ID)
    Create Index IX_Dimension_Name on Dimension(Name)
    I also suspect a composite index on master would be unique:

    Code:
     Create Index IX_Master_ID_Code on Master(ID, Code)
    Testimonial:
    pootle flump
    ur codings are working excelent.

  3. #3
    Join Date
    Mar 2007
    Posts
    97
    Quote Originally Posted by pootle flump
    like 'A44' should really be = 'A44'.
    like 'A44%' :-)
    Both ID's are not unique. Of course there is a unique field but the ID in this example is just kind of a foreign key. Moreover, there are a lot of Dimensions and much much more data fields. I think it was 14 foreign keys and about 40 data fields. But I wanted to keep things simple, so I did not mention.

    EDIT: Oh, my fault. In Dimension table, ID and NAME are unique!


    I did take a look at the query plan, but I can't really make sense of this.
    I see, that both plans are not equal, when I keep the like clause.

    But i found out something very interesting. If I create a composite index , as you suggested, and delete IX_Master_ID and IX_Master_Code, so that the server only uses the composite index, then both queries have the same execution time.

    Finally, this is a big problem. To grant best execution times, I will have to create a composite index, that includes nearly all columns that can be affected by a user query. Is this usual?
    Last edited by silas; 07-15-07 at 10:37.

  4. #4
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    Here's my 2p;
    Should this not really be a LEFT JOIN?

    In query B you need it to read Dimension.Name (could be an ambiguous column name - certainly sounds it!)
    EDIT: 'name' is also a reserved word - you should avoid using this!

    As mentioned before - your LIKE clause is wrong.
    Using a LIKE comparison means that your index on the column are ignored.
    George
    Home | Blog

  5. #5
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    What is the Count(*) value associated with the averages in your query? Your sample data doesn't help me get a handle on the query scale.

    It would help me a bunch if you could execute:
    Code:
    SET SHOWPLAN_TEXT ON
    GO
    meta_your_SQL_goes_here
    GO
    SET SHOWPLAN_TEXT OFF
    GO
    ...and post the results so we could see just what your server is doing.

    My first guess is bad statistics, but that's only a guess at this point.

    Oh, and by the way George:
    Quote Originally Posted by georgev
    Using a LIKE comparison means that your index on the column are ignored.
    is not always true. If the LIKE is unambiguous on the left (meaning there are no leading wildcards), then a LIKE can ride an index.

    -PatP

  6. #6
    Join Date
    Mar 2007
    Posts
    97
    Now I'm totally confused. I did another index on Price and what happens, query (b) now is 3 times faster that (a)

    Indexes are
    PK_objects_year_quarter_obid
    IX_objects_price
    IX_objects_eid
    IX_objects_zip

    IX_dim_estate_estate
    PK_dim_estate_eid

    PLAN A
    Code:
      |--Compute Scalar(DEFINE:([Expr1006]=CASE WHEN [globalagg1008]=(0) THEN NULL ELSE [globalagg1010]/CONVERT_IMPLICIT(decimal(19,0),[globalagg1008],0) END))
           |--Stream Aggregate(DEFINE:([globalagg1008]=SUM([partialagg1007]), [globalagg1010]=SUM([partialagg1009])))
                |--Nested Loops(Inner Join)
                     |--Clustered Index Seek(OBJECT:([testdb].[dbo].[dim_estate].[PK_dim_Estate_eid]), SEEK:([testdb].[dbo].[dim_estate].[EID]=(3.)) ORDERED FORWARD)
                     |--Stream Aggregate(DEFINE:([partialagg1007]=COUNT_BIG([testdb].[dbo].[Objects].[PRICE]), [partialagg1009]=SUM([testdb].[dbo].[Objects].[PRICE])))
                          |--Nested Loops(Inner Join, OUTER REFERENCES:([testdb].[dbo].[Objects].[YEAR], [testdb].[dbo].[Objects].[QUARTER], [testdb].[dbo].[Objects].[OBID], [Expr1012]) OPTIMIZED WITH UNORDERED PREFETCH)
                               |--Merge Join(Inner Join, MERGE:([testdb].[dbo].[Objects].[YEAR], [testdb].[dbo].[Objects].[QUARTER], [testdb].[dbo].[Objects].[OBID])=([testdb].[dbo].[Objects].[YEAR], [testdb].[dbo].[Objects].[QUARTER], [testdb].[dbo].[Objects].[OBID]), RESIDUAL:([testdb].[dbo].[Objects].[YEAR] = [testdb].[dbo].[Objects].[YEAR] AND [testdb].[dbo].[Objects].[QUARTER] = [testdb].[dbo].[Objects].[QUARTER] AND [testdb].[dbo].[Objects].[OBID] = [testdb].[dbo].[Objects].[OBID]))
                               |    |--Sort(ORDER BY:([testdb].[dbo].[Objects].[YEAR] ASC, [testdb].[dbo].[Objects].[QUARTER] ASC, [testdb].[dbo].[Objects].[OBID] ASC))
                               |    |    |--Index Seek(OBJECT:([testdb].[dbo].[Objects].[IX_objects_zip]), SEEK:([testdb].[dbo].[Objects].[ZIP] >= N'44' AND [testdb].[dbo].[Objects].[ZIP] < N'45'),  WHERE:([testdb].[dbo].[Objects].[ZIP] like N'44%') ORDERED FORWARD)
                               |    |--Index Seek(OBJECT:([testdb].[dbo].[Objects].[IX_objects_eid]), SEEK:([testdb].[dbo].[Objects].[EID]=(3.)) ORDERED FORWARD)
                               |--Clustered Index Seek(OBJECT:([testdb].[dbo].[Objects].[PK_Objects_year_quarter_obid]), SEEK:([testdb].[dbo].[Objects].[YEAR]=[testdb].[dbo].[Objects].[YEAR] AND [testdb].[dbo].[Objects].[QUARTER]=[testdb].[dbo].[Objects].[QUARTER] AND [testdb].[dbo].[Objects].[OBID]=[testdb].[dbo].[Objects].[OBID]) LOOKUP ORDERED FORWARD)
    PLAN B

    Code:
      |--Compute Scalar(DEFINE:([Expr1006]=CASE WHEN [globalagg1008]=(0) THEN NULL ELSE [globalagg1010]/CONVERT_IMPLICIT(decimal(19,0),[globalagg1008],0) END))
           |--Stream Aggregate(DEFINE:([globalagg1008]=SUM([partialagg1007]), [globalagg1010]=SUM([partialagg1009])))
                |--Nested Loops(Inner Join, OUTER REFERENCES:([testdb].[dbo].[Objects].[EID]))
                     |--Hash Match(Aggregate, HASH:([testdb].[dbo].[Objects].[EID]), RESIDUAL:([testdb].[dbo].[Objects].[EID] = [testdb].[dbo].[Objects].[EID]) DEFINE:([partialagg1007]=COUNT_BIG([testdb].[dbo].[Objects].[PRICE]), [partialagg1009]=SUM([testdb].[dbo].[Objects].[PRICE])))
                     |    |--Hash Match(Inner Join, HASH:([testdb].[dbo].[Objects].[YEAR], [testdb].[dbo].[Objects].[QUARTER], [testdb].[dbo].[Objects].[OBID])=([testdb].[dbo].[Objects].[YEAR], [testdb].[dbo].[Objects].[QUARTER], [testdb].[dbo].[Objects].[OBID]), RESIDUAL:([testdb].[dbo].[Objects].[YEAR] = [testdb].[dbo].[Objects].[YEAR] AND [testdb].[dbo].[Objects].[QUARTER] = [testdb].[dbo].[Objects].[QUARTER] AND [testdb].[dbo].[Objects].[OBID] = [testdb].[dbo].[Objects].[OBID]))
                     |         |--Hash Match(Inner Join, HASH:([testdb].[dbo].[Objects].[YEAR], [testdb].[dbo].[Objects].[QUARTER], [testdb].[dbo].[Objects].[OBID])=([testdb].[dbo].[Objects].[YEAR], [testdb].[dbo].[Objects].[QUARTER], [testdb].[dbo].[Objects].[OBID]), RESIDUAL:([testdb].[dbo].[Objects].[YEAR] = [testdb].[dbo].[Objects].[YEAR] AND [testdb].[dbo].[Objects].[QUARTER] = [testdb].[dbo].[Objects].[QUARTER] AND [testdb].[dbo].[Objects].[OBID] = [testdb].[dbo].[Objects].[OBID]))
                     |         |    |--Index Seek(OBJECT:([testdb].[dbo].[Objects].[IX_objects_zip]), SEEK:([testdb].[dbo].[Objects].[ZIP] >= N'44' AND [testdb].[dbo].[Objects].[ZIP] < N'45'),  WHERE:([testdb].[dbo].[Objects].[ZIP] like N'44%') ORDERED FORWARD)
                     |         |    |--Index Scan(OBJECT:([testdb].[dbo].[Objects].[IX_objects_eid]))
                     |         |--Index Scan(OBJECT:([testdb].[dbo].[Objects].[IX_objects_price]))
                     |--Clustered Index Seek(OBJECT:([testdb].[dbo].[dim_estate].[PK_dim_Estate_eid]), SEEK:([testdb].[dbo].[dim_estate].[EID]=[testdb].[dbo].[Objects].[EID]),  WHERE:([testdb].[dbo].[dim_estate].[Estate]='Villa') ORDERED FORWARD)

    Hope you can read this language ;-)

  7. #7
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Your second query is much more explicit, it queries a much smaller number of rows on the estate dimension.

    Once both queries are executed a few times so that the relevant data is loaded into cache, I would expect the queries to both run quickly, but the second one ought to run faster than the first one no matter how much RAM buffer you've got.

    -PatP

  8. #8
    Join Date
    Mar 2007
    Posts
    97
    Thanks Pat, but isn't this strange? Look at this

    (A)
    Code:
    Select AVG(Price) from dbo.objects
    where eid=3 and zip like '44%'
    (34 seconds)

    is slower than

    (B)
    Code:
    Select AVG(Price) 
    from dbo.objects join dbo.dim_estate on dbo.objects.eid=dbo.dim_estate.eid
    where estate = 'Villa' and zip like '44%'
    (13 seconds)

    This doesn't make sense to me. Is it because of Microsoft SQL Server?
    Ok, finally, the fastest query is the more userfriendly query (user does not have to know the right eid) but it still leaves kind of a bad taste. Just as you said (B) is better than (A).
    Another question, is there something like Oracles index only table in sql server?

  9. #9
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    There are apparently many things about your schema that are not intuitively obvious, based on the showplan output versus the code snippets that you've posted. My guess is that you've "simplified" the code snippets in some way, but posted the showplan output as it is actually generated by the SQL engine. Without a lot more "inside knowledge" to help me understand the differences, I can't offer a useful opinion.

    The MS-SQL and Oracle database engines are radically different in the way that they do things. Each has its strong and weak points, neither is intrinsically "better" or "worse" than the other, they are just different. Because of the differences in implementation, I can't think of anything quite like the Oracle index only table in the Microsoft SQL environment for this example.

    -PatP

  10. #10
    Join Date
    Mar 2007
    Posts
    97
    I only snipped Plan A. Showplan (b) belongs to Query (b). Nothing changed there. The Code for A originaly was

    Code:
    Select AVG(Price) from 
    dbo.objects join dbo.dim_estate on dbo.objects.eid=dbo.dim_estate.eid 
    where dbo.objects.eid=3 and zip like '44%'
    Obviously, I do not really need the join here, that's why i left it. The execution time in both cases is the same for (a).

    Anyway, here is the showplan for snipped (a) (without the fat text)

    Code:
      |--Compute Scalar(DEFINE:([Expr1003]=CASE WHEN [Expr1010]=(0) THEN NULL ELSE [Expr1011]/CONVERT_IMPLICIT(decimal(19,0),[Expr1010],0) END))
           |--Stream Aggregate(DEFINE:([Expr1010]=COUNT_BIG([testdb].[dbo].[Objects].[PRICE]), [Expr1011]=SUM([testdb].[dbo].[Objects].[PRICE])))
                |--Nested Loops(Inner Join, OUTER REFERENCES:([testdb].[dbo].[Objects].[YEAR], [testdb].[dbo].[Objects].[QUARTER], [testdb].[dbo].[Objects].[OBID], [Expr1009]) WITH UNORDERED PREFETCH)
                     |--Merge Join(Inner Join, MERGE:([testdb].[dbo].[Objects].[YEAR], [testdb].[dbo].[Objects].[QUARTER], [testdb].[dbo].[Objects].[OBID])=([testdb].[dbo].[Objects].[YEAR], [testdb].[dbo].[Objects].[QUARTER], [testdb].[dbo].[Objects].[OBID]), RESIDUAL:([testdb].[dbo].[Objects].[YEAR] = [testdb].[dbo].[Objects].[YEAR] AND [testdb].[dbo].[Objects].[QUARTER] = [testdb].[dbo].[Objects].[QUARTER] AND [testdb].[dbo].[Objects].[OBID] = [testdb].[dbo].[Objects].[OBID]))
                     |    |--Sort(ORDER BY:([testdb].[dbo].[Objects].[YEAR] ASC, [testdb].[dbo].[Objects].[QUARTER] ASC, [testdb].[dbo].[Objects].[OBID] ASC))
                     |    |    |--Index Seek(OBJECT:([testdb].[dbo].[Objects].[IX_objects_zip]), SEEK:([testdb].[dbo].[Objects].[ZIP] >= N'44' AND [testdb].[dbo].[Objects].[ZIP] < N'45'),  WHERE:([testdb].[dbo].[Objects].[ZIP] like N'44%') ORDERED FORWARD)
                     |    |--Index Seek(OBJECT:([testdb].[dbo].[Objects].[IX_objects_eid]), SEEK:([testdb].[dbo].[Objects].[EID]=(3.)) ORDERED FORWARD)
                     |--Clustered Index Seek(OBJECT:([testdb].[dbo].[Objects].[PK_Objects_year_quarter_obid]), SEEK:([testdb].[dbo].[Objects].[YEAR]=[testdb].[dbo].[Objects].[YEAR] AND [testdb].[dbo].[Objects].[QUARTER]=[testdb].[dbo].[Objects].[QUARTER] AND [testdb].[dbo].[Objects].[OBID]=[testdb].[dbo].[Objects].[OBID]) LOOKUP ORDERED FORWARD)

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •