Results 1 to 2 of 2
  1. #1
    Join Date
    Sep 2002
    Location
    Brazil
    Posts
    56

    Unanswered: Not In X Not Exists

    HY guys,
    New problem in my way!

    I`m creating a Procedure tha take the first 50th lines, then then the next 50th ....

    OK..
    I decided to use NOT IN like in the script below. I worked perfectly... But we know that NOT IN is not very good to SQL. Then I tried to make the same query using NOT EXISTS but it didn`t work.

    Example:
    CREATE TABLE test (
    tesID int NOT NULL,
    tesField varchar(50) NOT NULL
    )

    Insert into test (tesID, tesField) VALUES (1,'test1')
    Insert into test (tesID, tesField) VALUES (2,'test2')
    Insert into test (tesID, tesField) VALUES (3,'test3')
    Insert into test (tesID, tesField) VALUES (4,'test4')
    Insert into test (tesID, tesField) VALUES (5,'test6')

    SELECT TOP 50 *
    FROM test
    WHERE tesID NOT IN (SELECT TOP 50 tesID FROM test ORDER BY tesID)
    ORDER BY tesID


    SELECT TOP 50 *
    FROM test P1
    WHERE NOT EXISTS
    (SELECT TOP 50 * FROM test P2 WHERE P1.tesID = P2.tesID)


    Some suggestions?
    Diogo Hudson Dias
    DBA - SQL 6.5/7.0/2000
    ASBACE - ATP

  2. #2
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    the problem with any algorithm for paging (get the first N, then get the next N, and so on) is that in order to return the correct group of N records, you have to pass into the query some kind of parameter value that determines which group of N records

    the first time, you can just say "select top N IDs"

    the second time, you can say "select top N IDs where ID isn't in the top N IDs"

    however, the third time you do it, you have to say "select top N IDs where ID isn't in the top 2 * N IDs"

    so you can see, the general query is "select top N IDs where ID isn't in the top K * N IDs" where K starts at 0

    in other words, you always have to pass in a value for K

    however, the NOT IN (SELECT TOP K*N) is inefficient

    another way to approach it is to pass in the "last previous value"

    thus, the first time, you can again say "select top N IDs"

    make a note of the value of the ID for the Nth record

    then the second time, you can say "select top N IDs where ID is less than value" where you pass in the ID of the Nth record

    make a note of the value of the (2*N)th ID

    now, the third time you do it, you can again say "select top N IDs where ID is less than value"

    which do you think will be faster? you betcha, passing in an ID value rather than re-calculating a subquery to count rows right from the very top

    furthermore, passing in a value for the ID, rather than a record number, makes the query useful for other purposes (e.g. list first 50 suppliers starting with the letter "M")


    rudy
    http://rudy.ca/

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •