Results 1 to 8 of 8
  1. #1
    Join Date
    Dec 2003
    Posts
    3

    Question Unanswered: Trying to make a one-to-many join...Help please

    Simplified, I have two MS SQL Server tables that look like this:

    Investments
    [ ID ]
    [ shares ]

    Prices
    [ InvestID ]
    [ Date ]
    [ Price ]

    I want to retrieve the most recent price for each investment. Sample table values;

    Investment (ID, shares)
    Stock1, 100
    Stock1, 550
    Stock2, 300

    Prices (ID, date, price)
    Stock1, 05/30/2003, $15
    Stock1, 06/30/2003, $13
    Stock1, 07/31/2003, $18
    Stock2, 05/30/2003, $27
    Stock2, 06/30/2003, $34

    So for a run date of 07/31/2003 I want SQL to return:
    Stock1, 100, 07/31/2003, $18
    Stock1, 550, 07/31/2003, $18
    Stock2, 300, 06/30/2003, $34

    I can get the following to work for a single investment record, but cannot make the leap for each investment;

    SELECT i.ID, p.price max(p.date) FROM Investments i, Prices p WHERE i.id = p.id AND p.date <= '07/31/2003

    Any suggestions? Thanks.

  2. #2
    Join Date
    Dec 2003
    Location
    Toronto
    Posts
    15
    You can try this:

    select
    i.ID, i.shares, x.maxDate, p.Price
    from Investments i
    inner join
    (
    select
    ID = ix.ID,
    maxDate = max(px.Date)
    from Investments ix
    inner join Prices px on px.InvestID = ix.ID
    group by ix.ID
    ) x
    on x.ID = i.ID
    inner join Prices p on p.InvestID = i.ID and p.Date = x.maxDate

  3. #3
    Join Date
    Sep 2003
    Location
    The extremely Royal borough of Kensington, London
    Posts
    778
    select p.id, i.shares, p.date, p.price
    from prices p INNER JOIN investments i ON p.id = i.id
    where p.date = (select max(date) from prices where p.id = id);
    Last edited by r123456; 12-18-03 at 22:59.
    Bessie Braddock: Winston, you are drunk!
    Churchill: And Madam, you are ugly. And tomorrow, I'll be sober, and you will still be ugly.

  4. #4
    Join Date
    Dec 2003
    Location
    Toronto
    Posts
    15
    Hi r123456,

    It seems that both of us are wrong.
    But your solution is more elegant
    Both my sql statement and yours are returning only a subset of the values required by the problem.

    I've tried to reformulate the problem using two tables in Northwind, Customers and Orders.

    Here is a "translation" of your statement:

    select p.CustomerID, i.CompanyName, p.OrderDate, p.Freight
    from Orders p INNER JOIN Customers i ON p.CustomerID = i.CustomerID
    where p.OrderDate = (select max(OrderDate) from Orders where p.CustomerID = CustomerID);

    Now, Orders contains 830 records, while Customers has 91 records.
    The statement returns 90 rows, which I think is not we are expecting.
    I'm understanding that we'd like to have 830 records...

    But here is a small correction and everything is fine:
    select p.CustomerID, i.CompanyName, p.OrderDate, p.Freight
    from Orders p INNER JOIN Customers i ON p.CustomerID = i.CustomerID
    where p.OrderDate <= (select max(OrderDate) from Orders where p.CustomerID = CustomerID);



    I've changed my own script as follows:

    select
    i.CustomerID, p.OrderID, x.maxOrderDate, p.Freight
    from Orders p
    inner join Customers i
    on i.CustomerID = p.CustomerID
    inner join
    (
    select
    CustomerID = ix.CustomerID,
    maxOrderDate = max(px.OrderDate)
    from Customers ix
    inner join Orders px on px.CustomerID = ix.CustomerID
    group by ix.CustomerID
    ) x
    on x.CustomerID = i.CustomerID
    and x.maxOrderDate >= p.orderDate


    This statement returns also 830 rows, so I think it's ok.

    I've runned both solutions on two tables from other database, one with aprox 100k records (the correspondent of Investments) and the other one having 190 k records.

    Your solution is about 30% faster, which - to be honest with you - make no sense to me....

  5. #5
    Join Date
    Sep 2003
    Location
    The extremely Royal borough of Kensington, London
    Posts
    778
    Hi,

    The corrected query,

    select i.id, i.shares, p.st_date
    from prices p INNER JOIN investment i ON i.id = p.id
    where st_date = (select max(st_date) from prices where p.id = id AND st_date <= '07/31/2003);

    If I was to replace the original query with <= then every row would be returned upto that date, which is incorrect.

    The following should clear this issue.
    I want to retrieve the most recent price for each investment
    Although the term 'most recent' is not strictly defined I have interpreted it as meaning the maximum value equal to or less than the specified 'run date'.

    Cheers
    Last edited by r123456; 12-19-03 at 02:56.
    Bessie Braddock: Winston, you are drunk!
    Churchill: And Madam, you are ugly. And tomorrow, I'll be sober, and you will still be ugly.

  6. #6
    Join Date
    Dec 2003
    Posts
    3
    Thanks for all the suggestions. Most recent price means exactly that. The max price <= to the run date.

    To clarify the problem. Assume we have 10 investments and we have monthly prices for the past 5 years (thus 60 prices for each ID). For a given run date the challenge is to return the invest.ID with the price closest or equal to the run date. ID's with no price should also be returned.

    Therefore, we would return a list of the 10 investments and the closest price to the run date. Linking from investments to prices seem to make more sense to me than vice-versa.

    Additionally, I should have noted earlier that this code is used against MSSQL and another db - the glitch being that the ODBC driver for the second db can only use WHERE table1.key = table2.key for inner joins. It doesn't support the INNER JOIN keywords.

    And there may or may not be a price in the price file for each investment ID. Can the query be written without using the INNER JOIN syntax?

    Thanks again.
    Last edited by bhenness; 12-19-03 at 10:45.

  7. #7
    Join Date
    Sep 2003
    Location
    The extremely Royal borough of Kensington, London
    Posts
    778
    ID's with no price should also be returned.
    This depends on your table structure. I assume this is represented with nulls.

    select i.id, i.shares, p.st_date
    from prices p, investment i
    where p.id = i.id AND
    (st_date = (select max(st_date) from prices where p.id = id AND st_date <= '07/31/03') OR st_date is null);

    Multiple occurences of stock_id will be returned as a single stock_id may have many instances of 'shares'. To return only one instance per stock_id, alter the join condition to so that it joins on the primary / foreign key.
    Last edited by r123456; 12-19-03 at 20:58.
    Bessie Braddock: Winston, you are drunk!
    Churchill: And Madam, you are ugly. And tomorrow, I'll be sober, and you will still be ugly.

  8. #8
    Join Date
    Dec 2003
    Posts
    3

    What I ended up with...

    Thanks for the help. This statement works but is so slow it is virtually non-useable.

    "select distinct i.id, p.price, p.price_date from investments i, pricing p where c.id = p.id AND and p.price_date = " (select max(p2.price_date) from pricing p2 where p2.id = p.id AND p2.price_date <='2001/03/31')"

    Given that it has to do a select for each main record is what probably makes it so slow.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •