Page 1 of 2 12 LastLast
Results 1 to 15 of 17
  1. #1
    Join Date
    Jun 2004
    Location
    Tel Aviv - Israel
    Posts
    40

    Unanswered: Strage behavior when using TOP

    Hi

    I'm using MSSQL 2000 (with SP3) on Win2000.

    Now, before installing the service pack I encountered with several strange bugs in the MSSQL mostly in queries that used TOP, gladly they were all fixed when installing the service pack... or so I thought...
    So yesterday while trying to optimize a heavy query (7 joins - 2 of them are left joins from different tables crossed to the same table) I encountered yet again and with the latest service pack with an even stranger bug.
    First the returned records are just not always the same, meaning, for example when I use TOP 465 in the SELECT statement, the last record which is 465 contains some value, when using TOP 466 the record before the last which is record 465 contains different value!... of course both with the same ORDER BY clause.
    Also when I view the execution plan it's also not same in both cases, with TOP 465 it's one way (and much much faster) and with TOP 466 the plan is completely different and much slower...

    Does anyone encountered with this phenomenon? Any suggestions?
    BTW, don't pay too much attention to the number 465, in my case this is the border of the problem, after trying this query on different tables I found that each has it's own border that after it the TOP starts to freak out.

    Thanks for the help!

    Inon.

  2. #2
    Join Date
    Mar 2004
    Location
    Toronto
    Posts
    28
    Are the columns you are joining on all properly indexed?

    I saw a similar problem recently on a big join that did not have an index on one of its columns.

  3. #3
    Join Date
    Jun 2004
    Location
    Tel Aviv - Israel
    Posts
    40
    Quote Originally Posted by homer37
    Are the columns you are joining on all properly indexed?

    I saw a similar problem recently on a big join that did not have an index on one of its columns.
    Does it matter? the performance is not the issue right now, the different returned values are... but for your question, yes, the columns are indexed.

    I don't understand why there is a border of selected records(when using TOP) that before it the execution plan is efficient and after it not only that it's much slower(again, less disturbing) but the values are not the same, the first records are similar but somewhere in the middle they seize to be synchronized.
    I know it sounds confusing and indeed it's very strange, since the TOP cut occurs in the very end of the query, I don't see any reason why cutting a different bunch of records from the top should change both the execution plan and more disturbing the records (the part where they overlap).

    Inon.

  4. #4
    Join Date
    Mar 2004
    Location
    Toronto
    Posts
    28
    I was referring to the difference between the TOP 465 and TOP 466.

    I saw something similar with a big query with a TOP 2180 (fast) and TOP 2181 (very slow). We had hit a limit, and returning too much data without the proper indexing wasn't working.

  5. #5
    Join Date
    Jun 2004
    Location
    Tel Aviv - Israel
    Posts
    40
    Quote Originally Posted by homer37
    I was referring to the difference between the TOP 465 and TOP 466.

    I saw something similar with a big query with a TOP 2180 (fast) and TOP 2181 (very slow). We had hit a limit, and returning too much data without the proper indexing wasn't working.
    I see, sounds VERY similar, I don't understand the part were you say "We had hit a limit", why should it matter? Indexes affects only the speed of queries aren't they ??
    Why retrieving more data requires a different indexes? And how could I possibly know what new indexes too define?

    Can you please be more specific about how you finally solved the problem ?

    BTW, today the "border" of the problem was changed... in the same query on the same table it happens when trying to select 344 and 345 records... this indicates that system resources might also affect.

    Thank you very much for your help,
    I won't be able to respond for several hours.

    Inon.
    Last edited by Inoni; 06-06-04 at 05:43.

  6. #6
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Quote Originally Posted by Inoni
    I see, sounds VERY similar, I don't understand the part were you sat "We had hit a limit", why should it matter? Indexes affects only the speed of queries aren't they ??
    Why retrieving more data requires a different indexes? And how could I possibly know what new indexes too define?
    The process for building a query plan is quite complex. The process is designed to deal with arbitrary queries, with wide ranges of both resources and schema configurations. Changing the TOP limit on the number of rows returned can make the optimizer choose a radically different query plan, because of the expected difference in performance.

    The indicies that are available affect the query plan, because different index configurations imply different performance. If you have two different indicies, one of which has exactly the column you need and one containing other columns that aren't needed by this query, then the smaller index is more efficient because it requries less I/O to find the appropriate row. In another query, because the additional columns provide index coverage, the wider index might be more efficient. The index coverage can affect the need to even fetch rows from a subsidiary table if the FK is available, which drastically changes the plan.

    The whole issue of indexing and query optimization is complex, and rarely gets the attention that it deserves. You are seeing the "tip of the iceberg", but there is a LOT of detail that is logically "below" what you can see!

    -PatP

  7. #7
    Join Date
    Mar 2004
    Location
    Toronto
    Posts
    28
    Hi Inoni, our problem was that one column on a big table in the join did not have an index. It seems that TOP 2180 was the best it could do quickly. The solution was to put an index on that column, and then things worked fine.

  8. #8
    Join Date
    Jun 2004
    Location
    Tel Aviv - Israel
    Posts
    40
    Quote Originally Posted by Pat Phelan
    The process for building a query plan is quite complex. The process is designed to deal with arbitrary queries, with wide ranges of both resources and schema configurations. Changing the TOP limit on the number of rows returned can make the optimizer choose a radically different query plan, because of the expected difference in performance.

    The indicies that are available affect the query plan, because different index configurations imply different performance. If you have two different indicies, one of which has exactly the column you need and one containing other columns that aren't needed by this query, then the smaller index is more efficient because it requries less I/O to find the appropriate row. In another query, because the additional columns provide index coverage, the wider index might be more efficient. The index coverage can affect the need to even fetch rows from a subsidiary table if the FK is available, which drastically changes the plan.

    The whole issue of indexing and query optimization is complex, and rarely gets the attention that it deserves. You are seeing the "tip of the iceberg", but there is a LOT of detail that is logically "below" what you can see!

    -PatP
    Well all this shouldn't affect the returned records, right???
    Why they are not the same?? I am referring to the first records were they supposed to overlap perfectly, meaning, when selecting TOP 100 and TOP 101, records 1 to 100 should be exactly the same, but they don't, why ? Is this not a bug?

    I really appreciate the help!

    Inon.

  9. #9
    Join Date
    Apr 2004
    Location
    Kansas City, MO
    Posts
    734
    How are you running this query? It sounds like it's dynamic, so are you running this from an application as dynamic SQL?

    I don't believe the piece on different results with TOP either. If you can provide an example, that would be great. There has to be somewhere in your query where you're allowing for these different results. Why don't you let us take a look at the query and see if we can see anything.
    MeanOldDBA
    derrickleggett@hotmail.com
    When life gives you a lemon, fire the DBA.

  10. #10
    Join Date
    Jun 2004
    Location
    Tel Aviv - Israel
    Posts
    40
    Quote Originally Posted by derrickleggett
    How are you running this query? It sounds like it's dynamic, so are you running this from an application as dynamic SQL?

    I don't believe the piece on different results with TOP either. If you can provide an example, that would be great. There has to be somewhere in your query where you're allowing for these different results. Why don't you let us take a look at the query and see if we can see anything.
    It's important to add that the differences in the results are minor. They are almost exactly the same, but only almost...
    The queries are dynamic, but to analyze the problem I use the query analyzer, so like I said, EXACTLY the same query with the minor change in the TOP statement retrieve strange results, I will try to be more specific, if I select with TOP 465, the ID of the last record is "12516", now, when I select with TOP 466, it should retrieve the same 465 records as before plus one more which is the next in order by the ORDER BY clause, BUT! Suddenly ID "12516" which was the 465 record in the order is now in place 464, which means that one record out of the 465 has disappeared.... this is the bug.
    Although it has a minor affect on our whole system and it won't create any problems and will be VERY difficult to even noticed, I still won't to figure it out...
    Last detail, the "border" when the record disappears, is when the execution plan changes.

    Anyway, Here is the query:

    SELECT TOP 465 tblXXX_UserYYY.UserListID, tblXXX_UserYYY.Title,
    tblXXX_UserYYY.List, tblXXX_UserYYY.ListDate,
    tblXXX_UserYYY.UserScore, TblUsers.UserID, TblUsers.UserName,
    TblUsers.VSPoints, TblUsers.UserRanking, TblUsers.IsManager, tblXXX.productID,
    TblManufacturers.ManufacturerID, TblManufacturers.ManufacturerName,
    tblXXX.ProductName, tblXXX.Model,
    TblCategoryManagers.TreeCategoryID AS ManagerCategoryID
    FROM tblXXX_UserYYY, TblUsers, tblXXX, TblManufacturers, TblCategoryManagers,
    TblBaseCategories
    WHERE TblCategoryManagers.UserID =* tblXXX_UserYYY.UserID AND
    TblBaseCategories.BaseCategoryID = tblXXX.BaseCategoryID AND
    TblBaseCategories.TreeCategoryID *= TblCategoryManagers.TreeCategoryID AND
    tblXXX_UserYYY.UserID = TblUsers.UserID AND
    tblXXX.productID = tblXXX_UserYYY.productID AND
    tblXXX.ManufacturerID = tblManufacturers.ManufacturerID AND
    TblUsers.IsPowerUser = 1
    ORDER BY ListDate DESC

    Inon.
    Last edited by Inoni; 06-07-04 at 06:52.

  11. #11
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    ok...here goes...


    TOP sucks

    Dynamic SQL sucks

    NON ANSI Joins suck

    Surrogate keys suck

    Damn it's monday....damn hangover....

    WHAT are your really trying to do?

    Don't blame the optimizer...what you're doing is HIGHLY non deterministic...

    It's like saying I have a question for you and I want an answer...NOW...

    But I won't tell you what the question is...so what's the answer...

    I say the optimizer has done a pretty good job all in all...

    it's your job to make it clear enough....
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  12. #12
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    Sorry.....

    rough weekend....

    Please tell us what you're requirements are....
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  13. #13
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    Surrogate keys suck?
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  14. #14
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    Quote Originally Posted by blindman
    Surrogate keys suck?

    MOO

    Their benefit is HIGHLY outwieghed by the misuse of them.

    Let me ask you...would you EVER put a clustered index on one?
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  15. #15
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Quote Originally Posted by Brett Kaiser
    MOO

    Their benefit is HIGHLY outwieghed by the misuse of them.

    Let me ask you...would you EVER put a clustered index on one?
    Always.

    -PatP

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •