Results 1 to 7 of 7
  1. #1
    Join Date
    Feb 2004
    Location
    Hamburg, Germany
    Posts
    22

    Unanswered: Help with nearest neighbour problem

    Hi,

    I need help with creating an sql statement that determines the nearest neighbour in a lookup table to my dataset by its date column. My attempts minimizing the date difference in a cross join are lacking performance.

    I have two tables:

    Table 1 (Data; 13000 datasets):

    ID, date
    ==========
    1, 12.12.2006
    2, 28.12.2006
    3, 05.01.2007

    and Table 2 (Lookup; 4000 datasets):

    date, margin
    ==========
    05.12.2006, 2.80
    27.12.2006, 2.86
    01.01.2007, 3.01
    10.01.2007, 2.99

    Expected result:

    ID, date, margin
    ==========
    1, 12.12.2006, 2.80
    2, 28.12.2006, 2.86
    3, 05.01.2007, 3.01

    Any help is much desired

    Kai

  2. #2
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    could you explain please how to calculate "nearest"
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  3. #3
    Join Date
    Feb 2004
    Location
    Hamburg, Germany
    Posts
    22
    Quote Originally Posted by r937
    could you explain please how to calculate "nearest"
    For a given date in the data table I'm looking for the closest date in the lookup table, e.g. MIN(Datediff(dd,date(data),date(lookup))).

  4. #4
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    that datediff might produce negative numbers, and MIN will take the largest negative number

    do you perhaps mean MIN(ABS(...)) ?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  5. #5
    Join Date
    Feb 2004
    Location
    Hamburg, Germany
    Posts
    22
    Quote Originally Posted by r937
    that datediff might produce negative numbers, and MIN will take the largest negative number

    do you perhaps mean MIN(ABS(...)) ?
    You're right, it's MIN(ABS(...)). But how do I integrate this into a view to do effective lookups?

  6. #6
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    effective? i would imagine this to depend on the existence of appropriate indexes

    the following works (i tested it on your data) but i dunno how slow it's gonna be for your large tables...
    Code:
    with X 
         ( ID
         , TDate
         , a
         , LDate
         , margin
         )
    as ( 
    select T.ID
         , T.Date  as TDate
         , abs(datediff(dd,T.Date,L.Date)) as a
         , L.Date  as LDate
         , L.margin
      from table1 as T
    cross
      join lookup as L   
       ) 
    select ID
         , TDate
         , a
         , LDate
         , margin
      from X as D1
     where a = 
        ( select min(a)
            from X
           where ID = D1.ID )
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  7. #7
    Join Date
    Feb 2004
    Location
    Hamburg, Germany
    Posts
    22
    That's quite some nifty code. Unfortunately I had to rewrite the code to not use the "WITH" statement as MS SQL Server 2000 apparently doesn't support this. Anyway I've came down to 1 min. processing time from 57 min. without touching the indexes, so thanks a lot.

    This is my final code:
    Code:
    Select ID
         , TDate
         , a
         , LDate
         , margin
    from
    (
      select T.ID
         , T.Date  as TDate
         , abs(datediff(dd,T.Date,L.Date)) as a
         , L.Date  as LDate
         , L.margin
      from table1 as T
      cross
        join lookup as L   
    ) X
    where a = 
    (
      select min(abs(datediff(dd,T.Date,L.Date))) as amin
      from table1 as T
      cross
        join lookup as L   
      where T.ID = X.ID
    )

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •