Results 1 to 4 of 4
  1. #1
    Join Date
    Sep 2001
    Location
    Chicago, Illinois, USA
    Posts
    601

    Unanswered: Syntax to abandone a SELECT on first occurance of criteria. Possible?

    Have a need to scan a large table to see if a set of criteria have ever been met.

    If/when the scan hits its first record meeting the criteria, the scan can be abandoned.

    Is there some syntax/option that accomplishes this?


    Right now, I am doing a SELECT with criteria against the table and the @@ROWCOUNT gives me a zero, or non-zero value.

    But that methodology means that the SELECT has to execute against the entirety of the table.

    I'd like to abandon the SELECT as soon as it detects a first record meeting the criteria.


    This is an optimization exercise.


    Thanks.

    Ken
    Ken

    Maverick Software Design

    (847) 864-3600 x2

  2. #2
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    Hi Ken,

    First thing we must clear up is that "first" has no meaning without an order specified.

    Luckily there's a method that acts pretty much how you've described anyway: [b]EXISTS[/]. This can be used in a number of different ways but I guess the best way for your question is this:
    Code:
    IF EXISTS (SELECT * FROM your_table WHERE some_criteria = 'true')
      BEGIN
        PRINT 'some_criteria = true!';
      END
    Because you only have to find one item to satisfy the condition, that's all the query will do.

    Aside: "NOT EXISTS" won't do as well because to prove that something doesn't exist, you have to check them all!
    George
    Home | Blog

  3. #3
    Join Date
    Sep 2001
    Location
    Chicago, Illinois, USA
    Posts
    601
    Yes, my use of the word "first" was unfortunate. I should have used the phrase "first occurrence" (which should not imply any data ordering).

    But does EXISTS really modify the action of the SELECT, or does it simply take the results from the SELECT and evaluate if the result is zero, or non-zero? If all it does is evaluate the result of the SELECT, then it is not "abandoning" its scan upon hitting a "first occurrence" hit.

    Just to make my point, let me suggest an extreme (non-realistic) example. Let's suppose we have a non-indexed table of 100 million records and 1% of those records meet our SELECT criteria. Assuming a random distribution of those records meeting our criteria, we can expect to read one of those records by the time we have read through a million records. By abandoning the search at that point we've saved reading through 99% of our data.

    So, do EXISTS modify the action of the specified SELECT such that its search is abandoned upon a "first occurrence" hit?

    Thanks.
    Ken

    Maverick Software Design

    (847) 864-3600 x2

  4. #4
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    Best thing you can do is compare the query plans. I just ran a test similar to what you described (forced an index scan [sorry, no heaps that big here!] by applying a function to the column) on 15M records using a couple of methods.
    Code:
    SELECT TOP 1
           'Method 1'
    FROM   dbo.customers
    WHERE  SubString(lastname, 5, 1) = 'z'
    
    SELECT 'Method 2'
    WHERE   EXISTS (SELECT * FROM dbo.customers WHERE SubString(lastname, 5, 1) = 'z')
    
    IF EXISTS (SELECT * FROM dbo.customers WHERE SubString(lastname, 5, 1) = 'z')
      BEGIN
        SELECT 'Method 3'
      END
    The query plan for 2 and 3 are the same (as expected) but the first plan is different..

    1st performs an index scan then applies Top (no order specified).
    2nd and 3rd perform a constant scan (cost 0%?) and an index scan and then a nested loop).

    The statistics tell a slightly different story:
    Table 'CUST'. Scan count 1, logical reads 32, physical reads 0, read-ahead reads 270, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'CUST'. Scan count 1, logical reads 32, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'CUST'. Scan count 1, logical reads 32, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    This suggests the second and third methods are better in this instance.
    Statistic time suggests the same.


    Obviously this is not a thorough test but should give you something to play with.
    George
    Home | Blog

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •