Page 1 of 2 12 LastLast
Results 1 to 15 of 26
  1. #1
    Join Date
    Aug 2011
    Posts
    10

    Unanswered: Top X record per category

    I am trying to do what should be a simple query in MS Access 2003.

    I have a table that includes closing prices for the stocks of a number of companies, and I am trying to build a query to select the records for the last N days for each company. I could not get the required results by using the top N in query design, but I understand it should be easy in SQL, but I don't know SQL and tried different solutions on this board and could not get it to work.

    My table consists of the following fields: CompanySymbol, Date, Price, Volume. I need the query to return the records for the last 10 days (according to the Date) for each of the companies listed under CompanySymbol.

    I know, this might be a easy thing but I am new to Access and could not get it to work, so I appreciate the help.

    Thank you in advance
    Last edited by nhandal; 08-27-11 at 08:48.

  2. #2
    Join Date
    Nov 2004
    Location
    out on a limb
    Posts
    13,692
    Provided Answers: 59
    so where does the top bit come in
    from what you are saying you want the prices for the last 10 days, as opposed to the top (whatever).

    for the last 10 days the where clause should be something like

    where bargaindate between (datediff("dd",date(),10), date())

    if you open up the qeury designer and switch the to the SQL view you should be nearly tghere. the SQL view is usually the left most button under the menu. (3 options, run query, design or SQL)
    I'd rather be riding on the Tiger 800 or the Norton

  3. #3
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by healdem View Post
    so where does the top bit come in
    top 10 rows (by date descending) per company

    one company's last 10 might be within dateadd("dd",-10,date()) [note: dateadd, not datediff] and date(), but not every company will meet this criterion
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  4. #4
    Join Date
    Aug 2011
    Posts
    10
    Thanks for your responses.

    I should have been more clear. Let us assume the table looks like this:

    CompanySymbol, Date, Price, Volume

    IBM, August 1 2011, 102, 1000000
    INTC, August 1 2011, 70, 1500000
    GOOG, August 1 2011, 520, 10000000
    IBM, August 4 2011, 103, 1100000
    INTC, August 4 2011, 71, 1400000
    GOOG, August 4 2011, 521, 11000000
    IBM, August 5 2011, 105, 1000000
    INTC, August 5 2011, 75, 1500000
    GOOG, August 5 2011, 550, 10000000
    IBM, August 6 2011, 103, 1000000
    INTC, August 6 2011, 73, 1500000
    GOOG, August 6 2011, 530, 10000000
    IBM, August 7 2011, 102, 1000000
    INTC, August 7 2011, 70, 1500000
    GOOG, August 7 2011, 520, 10000000
    IBM, August 9 2011, 102, 1000000
    INTC, August 9 2011, 70, 1500000
    GOOG, August 9 2011, 520, 10000000

    For simplicity, let us say I need the top 3 records by date grouped by company, which means the query should return the records dated August 9 and August 7 and August 6 for all the three companies; IBM, INTC, GOOG. In other words the query should sort the records by date descending, then return the top 3 dates for each company resulting in the 9 lines from the 18 lines above, as I list them below (note that the dates may have gaps between them as in the original table) :

    IBM, August 9 2011, 102, 1000000
    IBM, August 7 2011, 102, 1000000
    IBM, August 6 2011, 103, 1000000
    INTC, August 9 2011, 70, 1500000
    INTC, August 7 2011, 70, 1500000
    INTC, August 6 2011, 73, 1500000
    GOOG, August 9 2011, 520, 10000000
    GOOG, August 7 2011, 520, 10000000
    GOOG, August 6 2011, 530, 10000000



    Thanks in advance.
    Last edited by nhandal; 08-28-11 at 03:08.

  5. #5
    Join Date
    Mar 2009
    Posts
    5,442
    Provided Answers: 14
    Try:
    Code:
    SELECT *
    FROM Tbl_Sales
    WHERE ( SELECT COUNT(*) 
            FROM Tbl_Sales AS s
            WHERE s.CompanySymbol = Tbl_Sales.CompanySymbol AND 
                  s.date > Tbl_Sales.date
          ) <= 2
    ORDER BY CompanySymbol, date DESC;
    Have a nice day!

  6. #6
    Join Date
    Aug 2011
    Posts
    10
    Thank you for the response.

    I tried to implement what you suggested, when I run the query, the computer stops responding and nothing is returned.

    Also, I hope you can explain what you wrote. As I am new to the SQL, I could not understand your code.

  7. #7
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    let me try to explain it

    the FROM clause returns rows from the sales table, using the table name

    the WHERE clause filters the rows, such that if a row fails the WHERE condition(s), then that row is not included in the result set

    the WHERE condition says that for each row in Tbl_Sales, certain rows in the same table, this time using the table alias "a", are compared to that paricular Tbl_Sales row, and these certain rows are the ones with the same company (s.CompanySymbol = Tbl_Sales.CompanySymbol) and with a later date (s.date > Tbl_Sales.date)

    these rows (same company, later date) are then counted

    if the count is less than or equal to 2 (i.e. 0, 1, or 2), that means there are only 0, 1, or 2 rows that have a later date for the same company, as compared to the row under consideration, which means that only the rows in the top 3 for each company are included in the result set

    this is called the "top N per group" pattern
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  8. #8
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    p.s. and to make the query efficient, there should be an index on company plus date
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  9. #9
    Join Date
    Aug 2011
    Posts
    10
    Thanks this was helpful. Although I don't think I can write such complex queries at this stage, your explanation helps to understand the logic behind the code.

    Now back to the potential problem in the code. As I said, when I try to run the query the computer stops responding and I get nothing, then I will have to restart the computer. Why is this happening, do you see any error in the code.

    Thanks,

  10. #10
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by nhandal View Post
    Why is this happening
    did you see my post about the index?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  11. #11
    Join Date
    Mar 2009
    Posts
    5,442
    Provided Answers: 14
    This topic (Select the top N rows from each group) is related to numbering rows. In the example I provided the Count() function is used to provide this numbering. Unfortunately it also uses a Cartesian Product (a Join of the table on itself) that can lead to very long computation times if the table has many rows. In extreme situations the process can lock the computer by using 100% of the processor activity and/or consuming all the available memory. Indexing the concerned column should decrease the processing time, though.

    There are several techniques to avoid using a Cartesian Product, unfortunately very few of them are usable with Access, most work only with a database engine such as SQL Server, Oracle, MySQL, etc which have a more complete set of SQL instructions and functions.

    Another solution that can be used with Access consists in creating a UNION query assembling several queries with each of them selecting only one category (CompanySymbol in your case). Something like:
    Code:
    SELECT * FROM (
        SELECT TOP 3 * FROM Tbl_Sales WHERE CompanySymbol = 'IBM' ORDER BY Date DESC
        UNION
        SELECT TOP 3 * FROM Tbl_Sales WHERE CompanySymbol = 'GOOG' ORDER BY Date DESC
        UNION
    SELECT TOP 3 * FROM Tbl_Sales WHERE CompanySymbol = 'INTC' ORDER BY Date DESC
    ) ORDER BY CompanySymbol, Date.
    Unfortunately this suppose that all possible values for the category are known in advance.

    You could easily use this technique to dynamically build the query with a little VBA:
    Code:
    Function SelectTop3From_Tbl_Sales()
    
        Const c_SQL As String = "SELECT TOP 3 * FROM Tbl_Sales WHERE CompanySymbol = '@' ORDER BY Date DESC"
        
        Dim rst As DAO.Recordset
        Dim qdf As DAO.QueryDef
        Dim strSQL As String
        
        ' Gather all existing values for CompanySymbol.
        '
        strSQL = "SELECT DISTINCT CompanySymbol FROM Tbl_Sales;"
        Set rst = CurrentDb.OpenRecordset(strSQL, dbOpenSnapshot)
        
        ' Assemble the query.
        '
        strSQL = ""
        With rst
            Do Until .EOF
                If strSQL <> "" Then strSQL = strSQL & " UNION "
                strSQL = strSQL & Replace(c_SQL, "@", !CompanySymbol)
                .MoveNext
            Loop
            .Close
        End With
        Set rst = Nothing
        strSQL = "SELECT * FROM (" & strSQL & ") ORDER BY CompanySymbol, Date DESC;"
        
        ' Edit the existing query object and replace its SQL property by the SQL previously assembled, then open it.
        '
        Set qdf = CurrentDb.QueryDefs("Qry_Top_3_Sales")    ' There must be a query named "Qry_Top_3_Sales" in the database
                                                            ' (can be: "SELECT * FROM Tbl_Sales", only the name matters).
        qdf.SQL = strSQL
        DoCmd.OpenQuery "Qry_Top_3_Sales"
        Set qdf = Nothing
        
    End Function
    Have a nice day!

  12. #12
    Join Date
    Aug 2011
    Posts
    10
    Thank you very much Sinndho and r937

    I ran the query on a table with 2,000 records it does exactly what I need, it took about 20 seconds to complete and when I put an index on the Date and CompanySymbol fields it took about 7 seconds, that is a big difference.

    Unfortunately, when I ran it on my real table which is approximately 90,000 records, it went on for about 7 hours and it was only half way through based on what the status bar was showing.

    I still need to test the query with indexing and with the VBA code that Sinndho provided. I hope I can get better results. I hope one of them can run and return the results in a maximum period of 30-40 minutes.

    I will let you know later today about the results of further testing.

    Thanks.

  13. #13
    Join Date
    Mar 2009
    Posts
    5,442
    Provided Answers: 14
    You're welcome!

    Please let me know how it went. If it's still too slow with the UNION query solution built from the VBA function, I'll try to find a faster way to get the proper results.
    Have a nice day!

  14. #14
    Join Date
    Aug 2011
    Posts
    10
    Sinndho,

    How do I run this VBA function?

    Please excuse my ignorance as I am new to Access.

  15. #15
    Join Date
    Mar 2009
    Posts
    5,442
    Provided Answers: 14
    The easiest way consists in:

    1. Create a new Module (in the Database window select: Module, then New).

    2. A new window opens with possibly one of two lines of text, such as:
    Code:
    Option Compare Database
    Option Explicit
    Copy the code of the function and paste it in the VBA editor after what's already in there. The module now looks like:
    Code:
    Option Compare Database
    Option Explicit
    
    Function SelectTop3From_Tbl_Sales()
    
        Const c_SQL As String = "SELECT TOP 3 * FROM Tbl_Sales WHERE CompanySymbol = '@' ORDER BY Date DESC"
        
        Dim rst As DAO.Recordset
        Dim qdf As DAO.QueryDef
        Dim strSQL As String
        
        ' Gather all existing values for CompanySymbol.
        '
        strSQL = "SELECT DISTINCT CompanySymbol FROM Tbl_Sales;"
        Set rst = CurrentDb.OpenRecordset(strSQL, dbOpenSnapshot)
        
        ' Assemble the query.
        '
        strSQL = ""
        With rst
            Do Until .EOF
                If strSQL <> "" Then strSQL = strSQL & " UNION "
                strSQL = strSQL & Replace(c_SQL, "@", !CompanySymbol)
                .MoveNext
            Loop
            .Close
        End With
        Set rst = Nothing
        strSQL = "SELECT * FROM (" & strSQL & ") ORDER BY CompanySymbol, Date DESC;"
        
        ' Edit the existing query object and replace its SQL property by the SQL previously assembled, then open it.
        '
        Set qdf = CurrentDb.QueryDefs("Qry_Top_3_Sales")    ' There must be a query named "Qry_Top_3_Sales" in the database
                                                            ' (can be: "SELECT * FROM Tbl_Sales", only the name matters).
        qdf.SQL = strSQL
        DoCmd.OpenQuery "Qry_Top_3_Sales"
        Set qdf = Nothing
        
    End Function
    3. Close the Module. If you're asked if you want to save it, answer Yes and give any name you want (a meaningful name is better).

    4. Create a new Query. What's in it does not really matter. It can be:
    Code:
    SELECT * FROM Tbl_Sales
    5. Save this query under the name: Qry_Top_3_Sales (this is important).

    6. The easiest way to run the code is to call it from a Macro. Create a macro, in the Action column, select: RunCode and in the Function Name textbox type: SelectTop3From_Tbl_Sales(). Save the macro (you can name it SelectTop3From_Tbl_Sales).

    7. When you run the Macro, the function is executed and the query opens.

    8. You can also call it from anywhere in VBA.
    Have a nice day!

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •