Results 1 to 12 of 12

Thread: searching...

  1. #1
    Join Date
    Feb 2004
    Location
    Bangalore, India
    Posts
    250

    Unanswered: searching...

    my app use a registration table

    StudReg
    (
    stName varchar(30),
    stDOB smalldatetime,
    stGuardianName varchar(30),
    stRegDt smalldatetime,
    stRegNo bigint,
    courseId smallint
    )


    the application registers the student details to a course.
    each student gets a new registration no during registration.

    the app should identify repeaters to a particular course by checking another
    table RegHistory, which stores the details of student registrations for the previous 5 years.

    RegHistory
    (
    stName varchar(30),
    stDOB smalldatetime,
    stGuardianName varchar(30),
    stPrevRegDt smalldatetime,
    stPrevRegNo bigint,
    courseId smallint
    )


    the application must search the RegHistory table and list out those students who are the repeaters.


    the sample entries in the two tables are as follows

    StudReg
    stName stDOB stGuardianName stRegDt stRegNo courseid
    -------------------------------------------------------------------------
    abc 01/01/1979 def 20/11/2004 12345 1
    def 01/01/1976 xyz 20/11/2004 12346 1
    ... ..... ... ...... .... ...

    mno 24/18/1976 pqr 20/11/2004 12400 1



    RegHistory

    stName stDOB stGuardianName stPrevRegDt stPrevRegNo courseId
    abc 01/01/1979 def 20/11/2001 2345 1
    ghi 01/01/1976 xyz 20/11/2001 2346 1
    ... ..... ... ...... .... ...

    dfg 24/18/1976 pqr 20/11/2001 2400 1


    to determine whether a student is a repeater or not, we have to search for an exact match in RegHistory table (where the student name, guardian name and date of birth in both tables match with the corresponding entries in Registration table).


    here is my question,

    if there are 100,000 students registering in each academic year, we will have
    500,000 records in RegHistory Table and 100,000 records in studReg table

    if i start searching for a repeater, i guess i will have to loop through all records in studReg, for an exact match in RegHistory, which wil be a time consuming process.

    is there any other options to search for repeaters ?

    pl discuss
    Cheers....

    baburajv

  2. #2
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    there is another option -- don't loop

    use a join

    and make sure the join columns are indexed

    a compound covering index might be best
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  3. #3
    Join Date
    Jul 2003
    Location
    San Antonio, TX
    Posts
    3,662
    Unless you need info returned from the History table as well, all you need is ...WHERE EXISTS(...) against your History table.
    "The data in a record depends on the Key to the record, the Whole Key, and
    nothing but the Key, so help me Codd."

  4. #4
    Join Date
    Feb 2004
    Location
    Bangalore, India
    Posts
    250
    thanks for the reply

    i created a view as follows,

    CREATE VIEW
    REPEATERS_VW
    AS
    SELECT DISTINCT
    A.stName,A.stGuardianName ,A.courseId ,A.stPrevRegNo ,
    A.stPrevRegDt ,A.stDOB,
    B.stRegNo
    FROM
    RegHistory A, StudReg B
    WHERE
    A.stName= B.stName
    AND
    A.stGuardianName = B.stGuardianName
    AND
    A.stDOB = B.stDOB

    since i am a newcomer to MSSQL, shall i ask what a
    "compound covering index" is ?
    Cheers....

    baburajv

  5. #5
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    an index is compound if it has more than one column

    an index is a covering index if it contains all the columns needed to satisfy a query

    in this case, an index on (stName,stGuardianName,stDOB) in the history table would be sufficient to tell you whether the student is a repeater, and the query would then not have to access the history table itself at all
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  6. #6
    Join Date
    Jul 2003
    Location
    San Antonio, TX
    Posts
    3,662
    Hmmm, that's not entirely correct, and Rudy even said it himself why, - the table WOULD have to be accessed if only the 3 mentioned fields comprised the index. In order for the index to qualify as "covering" it needs to include ALL the fields referenced by the query, which are stName, stGuardianName, courseId, stPrevRegNo, stPrevRegDt, and stDOB from RegHistory (RegHistory.idx_RegHistory_Covered_Index_For_All_F ields), and stRegNo, stName, stGuardianName, and stDOB from StudReg table (StudReg.idx_StudReg_Covered_Index_For_All_Fields) . Never underestimate the cost of the Bookmark Lookup!
    "The data in a record depends on the Key to the record, the Whole Key, and
    nothing but the Key, so help me Codd."

  7. #7
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    yeah, true, but only in the case of the query in post #4, which, in my opinion, is pulling the wrong fields

    if you go back and review the requirements in post #1, you need only those columns from the history table that need to be matched, i.e. the covering index of those 3 columns
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  8. #8
    Join Date
    Jul 2003
    Location
    San Antonio, TX
    Posts
    3,662
    Beg to disagree, even based on the requirements from post #1, because you're still retrieving values from other fields, thus - Bookmark lookup is unavoidable without those fields being part of the index as well.
    "The data in a record depends on the Key to the record, the Whole Key, and
    nothing but the Key, so help me Codd."

  9. #9
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    no, i am not retrieving values from other fields!!

    i don't know why you keep saying that i am!!

    the original requirements are to identify the repeaters

    this query identifies the repeaters:
    Code:
    select reg.stName 
         , reg.stDOB 
         , reg.stGuardianName
      from StudReg as reg
     where exists
         ( select null
             from RegHistory as hist
            where hist.stName         = reg.stName 
              and hist.stDOB          = reg.stDOB 
              and hist.stGuardianName = reg.stGuardianName )
    where's the "bookmark lookup" there, please?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  10. #10
    Join Date
    Jul 2003
    Location
    San Antonio, TX
    Posts
    3,662
    OK, maybe I read "with the corresponding entries in Registration table..." as "in addition to fields participating in JOIN". Besides, what good would 10 entries for the same student do if you do not include at least courseId to denote the fact that the student repeated 10 different courses?
    "The data in a record depends on the Key to the record, the Whole Key, and
    nothing but the Key, so help me Codd."

  11. #11
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    oh, man, i am such an idiot

    of course, you are right

    well, then, i guess we'll have to expand the covering index to include the course id

    i don't think the registration dates or registration numbers of previous registrations matter, as i would use the WHERE EXISTS structure

    but you're right, it makes no sense without the course id

    sorry for my pigheadedness
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  12. #12
    Join Date
    Jul 2003
    Location
    San Antonio, TX
    Posts
    3,662
    Hey, if it were not for you many questions would have been left unanswered or answered in a "half-ass" fashion!
    "The data in a record depends on the Key to the record, the Whole Key, and
    nothing but the Key, so help me Codd."

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •