Page 1 of 3
Results 1 to 15 of 38
  1. #1
    Join Date
    May 2006
    Posts
    35

    Unanswered: Should Large Lookup Tables be Stored in Access?

    Ok, here is my problem:


    I have hourly lookup tables for a specific aspect of the operation of many (around 100) devices.

    A program I am creating extracts a portion of the lookup table for each device selected and then parses each hourly data element sequentially.

    How should I store the lookup tables in Access?

    My initial idea was to use the following fields:
    Device ID | Date(hour) | Value

    This would mean having around 8760 * 100 (roughly 876,000) records in the table, and to extract the data for the individual devices I would have to run around 100 separate queries against this huge dataset to pull the records into arrays.

    Is there a better method? (I currently only have access to MS Access and flat files, and am currently programming in VBA (ADO).)
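[Editor's note: a minimal VBA/ADO sketch of pulling one device's rows into an array with a single filtered query; the table and field names (tblLookup, DeviceID, Dt, Value) are illustrative, not from the thread.]

```vba
' Sketch only: assumes a table tblLookup(DeviceID, Dt, Value) with an
' index on (DeviceID, Dt). Table and field names are illustrative.
Sub LoadDeviceValues(deviceId As Long)
    Dim cn As ADODB.Connection
    Dim rs As ADODB.Recordset
    Dim arrValues() As Variant

    Set cn = CurrentProject.Connection
    Set rs = New ADODB.Recordset

    ' One filtered query per device; with an index on DeviceID this
    ' is an index seek, not a scan of all ~876,000 rows.
    rs.Open "SELECT Dt, Value FROM tblLookup " & _
            "WHERE DeviceID = " & deviceId & " ORDER BY Dt", _
            cn, adOpenForwardOnly, adLockReadOnly

    If Not rs.EOF Then
        arrValues = rs.GetRows()   ' pull the whole result set into a 2-D array
    End If

    rs.Close
End Sub
```

With the index in place, 100 queries of this shape are each cheap; the cost scales with the rows returned per device, not the size of the whole table.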

  2. #2
    Join Date
    Nov 2004
    Location
    out on a limb
    Posts
    13,692
    Provided Answers: 59
    Nothing wrong with using lookup tables within Access, provided of course you refer to another table and don't use the 'lookup' wizard in the table design.

    Volume could be an issue: Access gets stroppy somewhere between 1 GB and 2 GB of data and many users, so it isn't necessarily the best tool for storing masses of data. Another candidate would be to change the data storage mechanism to a server-based SQL engine (MSDE, MySQL and MS SQL Server come to mind, but there will be other contenders too).
    I'd rather be riding on the Tiger 800 or the Norton

  3. #3
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Quote Originally Posted by Nai
    .....A program I am creating extracts a portion of the lookup table for each device selected and then parses each hourly data element sequentially.
    ....
    to extract the data for individual devices, I would have to perform around 100 individual queries on this huge dataset in total to pull the records into arrays.
    Hi

    Depends on what you are doing with the data. Fancy running through what you do perhaps posting code? There might be a huge bottleneck here if you aren't writing really efficient code.
    Testimonial:
    pootle flump
    ur codings are working excelent.

  4. #4
    Join Date
    May 2006
    Posts
    35
    I am still in the process of writing the code and it won't be very useful to post what I have, but I'll explain the current structure of my program.

    Currently, it has the following inputs:
    Tables:
    Existing Devices (Device ID, Name, Max power output, typical power output, etc.)
    New Devices (same as above)

    Spreadsheets:
    Existing Device Operating Profiles (Hours x # of Devices with dates on the leftmost column and Device IDs across the first row)
    New Device Operating Profiles (same as above)

    What my program does is take the union of Existing and New devices from their two respective tables and use their fields to populate a user-defined class array (with a variable for the "current power" (clsDevice(i).curPower) that is initialized to the "typical power" number stored in the table).

    For every timepoint that the program runs (each hour), the current power variables in clsDevice() are updated according to the associated value in the spreadsheets. However, not all devices have profiles, and in that case their current power remains at its initialized value.
    Then a separate array arrStack() sorts the indices (i) of clsDevice() by clsDevice(i).curPower.

    My problem is that I am trying to get away from spreadsheet usage if possible, partially due to the size of the data (the number of hours in the profiles can be greater than 40,000 if five years' worth is required).
    Another problem is matching up the profiles with their device in clsDevice(), but my plan is to read the profiles into memory (arrProfiles(profileID,hours)), and then to create a new variable clsDevice(i).profileID to match each device to its profile in arrProfiles.

    wow... that is a lot to take in, but I hope someone can help.
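[Editor's note: the structure described above could be sketched as a minimal VBA class module; any member names beyond those mentioned in the post are illustrative.]

```vba
' clsDevice (class module) - minimal sketch of the structure described
' in the post above; names not mentioned there are illustrative.
Public DeviceID As Long
Public Name As String
Public MaxPower As Double     ' "Max power output" from the table
Public TypPower As Double     ' "typical power output" from the table
Public curPower As Double     ' initialized to TypPower when loaded
Public ProfileID As Long      ' index into arrProfiles; e.g. -1 = no profile
```

An array of these is then populated from the union of the Existing and New device tables, with curPower overwritten each hour for devices that have a profile.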

  5. #5
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Hi Nai

    I am with you most of the way here. Sounds like a good idea to get away from spreadsheets. You obviously know at least something about normalisation since you are normalising your spreadsheet.

    I am always a little bit suspicious of two tables that are identical but are used to store information about "different" entities. Could the new/existing distinction actually be considered an attribute of a device? For certain, a union of the two tables will be less efficient to work with than one large (in terms of rows) table.

    Also - I'm not clear what the class actually does. I can see what it stores but not what its purpose is. Can you help me out on that one?

    HTH

  6. #6
    Join Date
    May 2006
    Posts
    35
    A1: The reason I have two tables is because the existing devices do not change, but the new devices do change based on the scenario I am running. Hence, users can dump a table of new devices for each scenario rather than having to edit the original table.

    A2: I am essentially using a class array to keep all the data straight:
    (Device ID, name, on/off status, max power, cur power)
    During the operation of the program, certain devices can be turned on or off and the current power that I use in my calculations is constant for some devices, and changes according to the profiles for other devices.

    What the program does is loop through each hour for a time period, check the status of the device (on/off) for each hour, and if it is on, perform a calculation with its current power.

    Should I be using an alternative to a class?

  7. #7
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Hi

    I think I would still have one table and delete/insert as required.

    However - I think more important is to see whether you can improve the way you interact with your data. As far as I can see, your class array is a way to get the table contents into memory so you can loop through them, read them and perform calculations? If so, a set-based approach is virtually always more efficient (the main exception is if you are referring to other, e.g. "previous", rows).

    Basically - if you aren't using the class to do anything really exotic then I would keep the data straight by leaving it in the table.

    HTH

  8. #8
    Join Date
    May 2006
    Posts
    35
    So if I understand you correctly, I shouldn't bring the data into memory as arrays at all?

    Hmm... I may be able to manage that.

    My only problem is that some devices have hourly profiles while others only have one power output value. Additionally, in my calculations, I am grouping the devices by zone (one of the variables of the class) and sorting them into a stack (by either current power output, or another variable called "cost").

    Does this mean that I should first create some sort of complex query that generates an hourly profile for each device (combining the current profiles and outage schedules), sorted by hour and cost (or current power output), and then query this table each hour to get the output of every device?

    This process seems a bit slow when my simulations usually last 8760 hours...

  9. #9
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Well - difficult to be sure of the details, not having seen the code etc., but broadly yes.

    The gist is that SQL works on sets of data. It is rarely slower than looping through large cursors. Having said that, the limitations of JET can also be exposed. How about posting some code or a sample db with a wee bit of data?

  10. #10
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Sorry - forgot you don't have any code. Are you comfy with SQL?

  11. #11
    Join Date
    May 2006
    Posts
    35
    I'm in the process of learning SQL :P

    That being said, currently I'm having some fun with IIf(expr, true, false) statements to combine an outage schedule with the current device output. By adding a table with the date/times, this is repeated for the selected period.

    I will send over a zipped .mdb after I change some of the data (this is a side project at my lab).

    Thanks

  12. #12
    Join Date
    May 2006
    Posts
    35
    Ok...

    Here is one of my queries.
    What it does is take date/times (sequential) from tblODF.Dt, IDs from GID, and the maximum power from Installed_Capacity. It then joins the Existing table with the Outage table and, in a separate field, inserts the value of Installed_Capacity for each GID if the current date is outside the bounds specified in the Outage table for that GID.

    However, it takes a long time to run for 8760 hours.

    This is what I was trying to do:
    Code:
    SELECT tblODF.Dt, tblExisting.GID, tblExisting.Installed_Capacity, tblOutages.StartDate,
        IIf(tblODF!Dt >= tblOutages!StartDate And tblODF!Dt < tblOutages!EndDate, 0, tblExisting!Installed_Capacity) AS Expr1
    FROM tblODF, tblExisting INNER JOIN tblOutages ON tblExisting.GID = tblOutages.GID
    WHERE tblODF.Dt >= #1/1/2007# And tblODF.Dt < #1/1/2008#
    GROUP BY tblODF.Dt, tblExisting.GID, tblExisting.Installed_Capacity, tblOutages.StartDate,
        IIf(tblODF!Dt >= tblOutages!StartDate And tblODF!Dt < tblOutages!EndDate, 0, tblExisting!Installed_Capacity),
        tblExisting.Cost
    ORDER BY tblODF.Dt, tblExisting.Cost;

    (Note: the GROUP BY must list tblExisting.GID rather than tblExisting.Plant to match the SELECT list, and the date filter belongs in a WHERE clause rather than HAVING, since it involves no aggregates - JET can then filter rows before grouping.)


    My next step would be to use the hourly profiles for each GID instead of the Installed Capacity (Maximum Power).

  13. #13
    Join Date
    May 2006
    Posts
    35
    I am uploading a sample subset of my data.

    On a sidenote, when I delete a large number of records, why is it that the size of the .mdb does not change, but when I re-import those same tables into a new .mdb, the size is much smaller?

    Thanks for all of your help!
    Attached Files

  14. #14
    Join Date
    Apr 2004
    Location
    outside the rim
    Posts
    1,011
    I'd be wary of using the immediate ifs (IIf) in your queries - adding functions to queries is one of those pot holes that can turn into a cavern as the number of records increases.

    One way around it (frowned on by many, but more efficient) is to make a public function where, the first time it is called (on the first record), it evaluates the IIf and stores the result in a public variable; every call after that simply returns the public variable. Just remember to reset the public variable to Empty prior to running the query.

    In your case, it looks like you are checking values against each other on the fly.

    An immediate performance gain would be to combine the two functions - move the iif to a public function and reference the public function in both instances - half as much compiling for Access to do as it runs the query.
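[Editor's note: the combined public function suggested above might look like the sketch below; the name DeviceOutput and the parameter names are illustrative.]

```vba
' Sketch of moving the inline IIf into a single public function.
' Called once per record and referenced in both the SELECT list and the
' GROUP BY, in place of repeating the IIf expression twice.
Public Function DeviceOutput(dt As Variant, startDate As Variant, _
                             endDate As Variant, capacity As Variant) As Variant
    ' Zero output while the device is inside its outage window,
    ' otherwise the installed capacity.
    If dt >= startDate And dt < endDate Then
        DeviceOutput = 0
    Else
        DeviceOutput = capacity
    End If
End Function
```

In the query this would be referenced as, e.g., DeviceOutput(tblODF!Dt, tblOutages!StartDate, tblOutages!EndDate, tblExisting!Installed_Capacity) AS Expr1, with the identical call in the GROUP BY.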

    A way to do this without the iif functions at all is to use two select queries, one including
    Code:
    ... 0 As Expr1 ... 
    ... WHERE tblODF.Dt >= tblOutages.StartDate And tblODF.Dt < tblOutages.EndDate ...
    and the other
    Code:
    ... tblExisting.Installed_Capacity As Expr1 ...
    ...WHERE tblODF.Dt < tblOutages.StartDate Or tblODF.Dt >= tblOutages.EndDate ...
    and then UNION those, and then run your query off the UNION query, using Expr1 as the GROUP BY instead of the iif function.

    Achieves the same result with no inline functions for Expr1. A little more tedious to assemble, but on large data sets it will be faster.
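[Editor's note: assembled from VBA as a saved DAO query, the UNION approach above might look like this sketch; table and field names follow the earlier posts, and the query name qryDeviceOutput is illustrative.]

```vba
' Sketch: build the UNION described above and save it as a query so
' the downstream GROUP BY query can run off it with no inline IIf.
Sub CreateOutageUnionQuery()
    Dim strSql As String
    strSql = _
        "SELECT tblODF.Dt, tblExisting.GID, 0 AS Expr1 " & _
        "FROM tblODF, tblExisting INNER JOIN tblOutages " & _
        "ON tblExisting.GID = tblOutages.GID " & _
        "WHERE tblODF.Dt >= tblOutages.StartDate " & _
        "AND tblODF.Dt < tblOutages.EndDate " & _
        "UNION " & _
        "SELECT tblODF.Dt, tblExisting.GID, " & _
        "tblExisting.Installed_Capacity AS Expr1 " & _
        "FROM tblODF, tblExisting INNER JOIN tblOutages " & _
        "ON tblExisting.GID = tblOutages.GID " & _
        "WHERE tblODF.Dt < tblOutages.StartDate " & _
        "OR tblODF.Dt >= tblOutages.EndDate"
    CurrentDb.CreateQueryDef "qryDeviceOutput", strSql
End Sub
```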

    good luck!
    tc

  15. #15
    Join Date
    Apr 2004
    Location
    outside the rim
    Posts
    1,011
    oh yeah, forgot to mention:

    It's probably fastest if you skip all the fancy query work and do it all in code using a couple recordsets and an array.

    You could use a temporary table instead of the results array to store and display the data (since Access doesn't allow unbound recordsets).
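[Editor's note: a minimal sketch of the recordset-plus-temporary-table approach described above; the table and field names (tblProfiles, tblResults, Output) are illustrative, and the per-row calculation is a placeholder.]

```vba
' Sketch: read the source rows once with a forward-only recordset,
' compute per row, and write the results into a temporary table.
Sub RunSimulation()
    Dim db As DAO.Database
    Dim rsIn As DAO.Recordset, rsOut As DAO.Recordset

    Set db = CurrentDb
    Set rsIn = db.OpenRecordset( _
        "SELECT DeviceID, Dt, Value FROM tblProfiles ORDER BY Dt", _
        dbOpenSnapshot)
    Set rsOut = db.OpenRecordset("tblResults", dbOpenDynaset)  ' temp table

    Do While Not rsIn.EOF
        rsOut.AddNew
        rsOut!DeviceID = rsIn!DeviceID
        rsOut!Dt = rsIn!Dt
        rsOut!Output = rsIn!Value   ' placeholder for the real calculation
        rsOut.Update
        rsIn.MoveNext
    Loop

    rsIn.Close
    rsOut.Close
End Sub
```

A form or report can then be bound to tblResults, which stands in for the unbound recordset Access doesn't offer.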

    tc
