Results 1 to 14 of 14
  1. #1
    Join Date
    Jun 2011
    Posts
    6

    Unanswered: Access VBA - Updating fields causing database bloat?

I'm having a problem with my database becoming bloated. (I'm using Access 2007 under Windows XP.)

I suspect that editing data fields within a table is causing my database to become bloated. Here's the full story - everything is being done with VBA code.

I have an empty table which I append records to by means of an INSERT query. I really only add skeleton information, enough to have a unique identifier - Customer ID and Transaction ID. All the other fields are empty.

    Once this is done, I loop through all the records in this target table. I pick up information from another table and update individual fields in the target table based on what I find in the other table. Altogether there are around 70,000 records in the target table and perhaps 20 fields in each record that are updated.
    This seems to cause the database to bloat in size - and in fact it exceeds the 2GB limit long before it's updated all 70,000 records. Compacting the database shrinks it back down to a much more reasonable size.

Why would editing individual fields cause the database size to bloat so much? Is it because Access is storing the previous values so the updates could be undone? If so, is there any way to prevent this?

    Of course I'm by no means certain that it is the editing of the fields that is causing this bloat but I have done some testing and this seems to be the cause. I tried populating the fields in the first step (with the INSERT query), and then only editing the ones that needed changes. This produces the same end result but requires far less editing - and the bloat is much smaller, although it's still quite big.

    I'd love to hear any suggestions as to what is causing this and how I could avoid it.

  2. #2
    Join Date
    Jun 2005
    Location
    Richmond, Virginia USA
    Posts
    2,763
    Provided Answers: 19
    Quote Originally Posted by HedgePig View Post
    Once this is done, I loop through all the records in this target table. I pick up information from another table and update individual fields in the target table based on what I find in the other table.
    How, exactly, are you doing this looping and how are you picking up other info from another table?

    Linq ;0)>
    Hope this helps!

    The problem with making anything foolproof...is that fools are so darn ingenious!

    All posts/responses based on Access 2003/2007

  3. #3
    Join Date
    Oct 2009
    Posts
    340
    you post: "...have an empty table which I append records to by means of an INSERT query"

does that mean you are using a table as a temp scratch pad? If so, then even though you delete those records at some point before reuse, the space is not released, I believe...

you can use the compact-on-close option. You might try using an UPDATE query rather than a clear-and-append cycle - not sure, but it might work by not continuously adding more records.

    of course also as a db designer I would question as to whether looping thru 70k records could not be more efficiently approached via action queries - -
    www CahabaData com
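A rough sketch of the action-query idea above - the table and field names here are only illustrative, not taken from the actual database, and a straight join update like this only works where the target value can be copied directly rather than derived from a "most frequent value" rule:

    Code:
    'Hypothetical sketch of a single action query replacing a VBA loop.
    'Table/field names are illustrative only.
    Dim strSQL As String
    strSQL = "UPDATE TargetTable INNER JOIN SourceTable " & _
             "ON TargetTable.CustomerID = SourceTable.CustomerID " & _
             "SET TargetTable.Field3 = SourceTable.Field3;"
    CurrentDb.Execute strSQL, dbFailOnError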

  4. #4
    Join Date
    Dec 2010
    Location
    Manchester
    Posts
    24
There are a few things you could be doing wrong, but we won't know for definite unless we see the code!

    any chance you can post it?
    Would be very beneficial in helping to solve your problem

  5. #5
    Join Date
    Jun 2005
    Location
    Richmond, Virginia USA
    Posts
    2,763
    Provided Answers: 19
    Quote Originally Posted by NTC View Post
    you can use the compact-on-close option.
    The Compact-on-close option is quite possibly the worst feature the Access Gnomes have ever come up with! Compacting is known to frequently cause data corruption/loss and should never be done without first making a backup copy of the database.

    Linq ;0)>
    Hope this helps!

    The problem with making anything foolproof...is that fools are so darn ingenious!

    All posts/responses based on Access 2003/2007

  6. #6
    Join Date
    Jun 2011
    Posts
    6
    Thanks very much for the help, folks! I'm both revising my thoughts as to what is causing the problem as well as posting some sample code. After further testing, I now suspect that it has something to do with collection objects not freeing up space when I've finished with them, rather than the editing of the records. However, I can't see what I'm doing wrong.

The code is a simplification of the "real" code - but running it produces exactly the same problem. If I start with a 10Mb database, running the code once causes it to expand to 50Mb - and compacting it shrinks it back down to 10Mb.


    There are 2 tables in the database:
    1) A source table with Customer ID, Transaction ID and 6 further fields (which contain random data for testing purposes.) Each customer ID can have several transaction ID's.
    2) A Target table which contains Customer ID and the same 6 further fields as occur in the source table.

The aim of the code is to take the source table and create a list of unique customer ID's in the target table.
Once this is done, the code loops through the source data and updates each of the fields in the target table with the value that occurs most often for that field in the source table (for a particular customer ID). In the "real" app, the rules for updating each field may be more complicated.
To track how many times a value occurs in a particular field, I'm using a collection. For each distinct value, an instance of clnItemClass is added to the collection. clnItemClass contains the value and a counter for how many times that value occurs. Once I have finished with all the transaction records for the current customer, I update each field and then clear the collections (I think!) and repeat the process for the next customer.
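For what it's worth, the same counting could also be done with a Scripting.Dictionary instead of a Collection plus a wrapper class - this is just an untested alternative sketch of the idea, not the code I'm actually running:

    Code:
    'Alternative sketch: count occurrences with a Scripting.Dictionary
    '(late bound). Create it with:
    '   Set dict = CreateObject("Scripting.Dictionary")
    Public Sub CountValue(dict As Object, itemval As Variant)
       Dim key As String
       key = CStr(itemval)
       If dict.Exists(key) Then
          dict(key) = dict(key) + 1   'bump the counter for this value
       Else
          dict.Add key, 1             'first time we've seen this value
       End If
    End Sub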


    Any suggestions would be greatly appreciated!

    Regards
    HedgePig

    Code:
    'Creates a Unique list of Customer ID's (selected from the source table) and appends them to the target table
    'Then loops through the source table and updates the remaining fields in the target table. For a given customer ID, each field is updated with whatever value occurs the most often in for that customer ID in the source table
    
    Sub MainRoutine()
    
    Dim TgtRS As Recordset  'Contains the "summarised" output - each customer ID has only one record
    Dim SrcRS As Recordset  'Contains source data - each customer ID can have many transactions
    
       CurrentDb.Execute "DELETE * From TargetTable;"
       'Insert just the customer ID's into the target database.
       'Later on populate the other fields
       CurrentDb.Execute "INSERT INTO TargetTable (CustomerID) " & _
                         "SELECT SourceTable.CustomerID " & _
                         "FROM SourceTable GROUP BY SourceTable.CustomerID;"
      
      Set TgtRS = CurrentDb.OpenRecordset("SELECT CustomerID, Field3, Field4, Field5, Field6, Field7, Field8 FROM TargetTable ORDER BY CustomerID")
      
      'Now populate the other fields
      Set SrcRS = CurrentDb.OpenRecordset("SELECT CustomerID, TransactionID, Field3, Field4, Field5, Field6, Field7, Field8 FROM SourceTable ORDER BY CustomerID, TransactionID")
      Call UpdateRecords(TgtRS, SrcRS)
    
       Set TgtRS = Nothing
       Set SrcRS = Nothing
    
    End Sub
    
    Sub UpdateRecords(TgtRS As Recordset, SrcRS As Recordset)
    Dim F3cln As New Collection
    Dim F4cln As New Collection
    Dim F5cln As New Collection
    Dim F6cln As New Collection
    Dim F7cln As New Collection
    Dim F8cln As New Collection
    
       Do While Not SrcRS.EOF
          Do While TgtRS!CustomerID = SrcRS!CustomerID
             
             Call UpdateCln(F3cln, SrcRS!Field3)
             Call UpdateCln(F4cln, SrcRS!Field4)
             Call UpdateCln(F5cln, SrcRS!Field5)
             Call UpdateCln(F6cln, SrcRS!Field6)
             Call UpdateCln(F7cln, SrcRS!Field7)
             Call UpdateCln(F8cln, SrcRS!Field8)
             
             SrcRS.MoveNext
             If SrcRS.EOF Then Exit Do
          Loop
          Call UpdateField(TgtRS, F3cln, TgtRS!Field3)
          Call UpdateField(TgtRS, F4cln, TgtRS!Field4)
          Call UpdateField(TgtRS, F5cln, TgtRS!Field5)
          Call UpdateField(TgtRS, F6cln, TgtRS!Field6)
          Call UpdateField(TgtRS, F7cln, TgtRS!Field7)
          Call UpdateField(TgtRS, F8cln, TgtRS!Field8)
                
          Call EmptyCln(F3cln)
          Call EmptyCln(F4cln)
          Call EmptyCln(F5cln)
          Call EmptyCln(F6cln)
          Call EmptyCln(F7cln)
          Call EmptyCln(F8cln)
          
          TgtRS.MoveNext
          If TgtRS.EOF Then Exit Do
       Loop
       
       
       
    End Sub
    
    
    'Increases the count for the occurrence of the specific itemval.
    'If it hasn't occurred before, an error is generated but the error trapping adds it to the collection
    Public Sub UpdateCln(cln As Collection, itemval As Variant)
    'Static citem As clnitemClass
    Static cstritem As String
    
    cstritem = CStr(itemval)
       On Error GoTo missingitemerr
       cln(cstritem).icount = cln(cstritem).icount + 1
    
    Exit Sub
    
    missingitemerr:
       'citem.itemval = itemval
       cln.Add New clnItemClass, CStr(itemval)
       cln(cstritem).itemval = itemval
       On Error GoTo 0
    Resume
    End Sub
    
    'Picks the value which occurs the most often and updates the field with that value
    Public Sub UpdateField(RS As Recordset, cln As Collection, fld As Field, Optional minchanges As Integer = 1)
    Dim modval As Variant
       
       If cln.Count >= minchanges Then
          modval = FindMaxOccVal(cln)
          RS.Edit
          fld = modval
          RS.Update
       End If
    
    End Sub
    
    'Returns the field value that occurs the most often in a collection (for ties on occurrence, returns the most recent value)
    Function FindMaxOccVal(cln As Collection) As Variant
    Static i As Integer
    Static MaxOcc As Long       'highest number of occurrences (Long to match icount)
    Static MaxOccPos As Integer 'position in collection where the maximum occurrence occurs
       MaxOcc = -1
       MaxOccPos = -1
       For i = 1 To cln.Count
          If cln(i).icount >= MaxOcc Then
             MaxOcc = cln(i).icount
             MaxOccPos = i
          End If
       Next
       FindMaxOccVal = cln(MaxOccPos).itemval
    
    End Function
    Sub EmptyCln(cln As Collection)
       Do While cln.Count > 0
          'Set cln(1) = Nothing
          cln.Remove (1)
       Loop
       
    End Sub

    Code:
    'Code for the clnItemClass class
    Option Compare Database
    
    Public icount As Long
    Public itemval As Variant
    Last edited by HedgePig; 06-04-11 at 06:48.

  7. #7
    Join Date
    Jul 2004
    Location
    South Dakota
    Posts
    267
One thing I noticed when quickly scanning your code is that you never release any of the collection objects that you create with the New keyword in the UpdateRecords sub.

    Set F3cln = Nothing
    ...
    ...
    ...
    Set F8cln = Nothing

    C

  8. #8
    Join Date
    Jun 2011
    Posts
    6
    Hello canopus

    Thanks for the suggestion - but it makes no difference! I had in fact tried that within the EmptyCln routine.

    I wonder if it's to do with objects I'm adding to the collection?
    cln.Add New clnItemClass, CStr(itemval)

    While I delete the items in the collection, I am not actually setting them to be equal to nothing. It seems I can't do this directly anyhow as
    Set cln(1) = Nothing
    generates an error message

    I must admit I'm still baffled by what is causing this. Most frustrating!

    Regards
    HedgePig

  9. #9
    Join Date
    Jul 2004
    Location
    South Dakota
    Posts
    267
    I think you might be on to something with the line:

    cln.Add New clnItemClass, CStr(itemval)

Every time that it is called it creates another instance of the class clnItemClass. Nowhere are you destroying that object. You can kill the collection cln, but nowhere are you destroying the clnItemClass objects that you created.

    C

  10. #10
    Join Date
    Jun 2011
    Posts
    6
    Hmmm.....I take back what I said about the collections being the cause of the problem. At least for now.

I find that if I leave everything unchanged except that I comment out the 3 lines that actually update the fields in the UpdateField routine
    RS.Edit
    fld = modval
    RS.Update
then I don't get any database bloat at all!

    [Update] In fact, it seems to be just the fld = modval line that is the problem.
    Last edited by HedgePig; 06-04-11 at 11:43.

  11. #11
    Join Date
    Jun 2011
    Posts
    6
I've tried another simple experiment. This time I created a database with a single table that has 5 text fields: Field1, Field2, Field3, Field4 and Field5.

    In my test run I had just under 10,000 records in this table. The database was around 1.3 Mb in size.

    Running the following code _sometimes_ causes the database to bloat enormously - to around 38 Mb. Compacting the database restored it to its former slim self.

    I am (still) completely baffled as to what is happening and why!

    Code:
    Sub Main()
    
    Dim cln As New Collection
    Dim RS As Recordset
    Dim i As Long
    
       Set RS = CurrentDb.OpenRecordset("SELECT Field1, Field2, Field3, Field4, Field5 from Table1;")
       RS.MoveFirst
       For i = 1 To 10
          Do While Not RS.EOF
             RS.Edit
                RS!field1 = i & " ABCDEFGHIJKLMNOPQRSTUVWXYZ " & Chr(65 + Rnd() * 26)
                RS!Field2 = i & " HORRIBLYBLOATEDDATABASE " & Chr(65 + Rnd() * 26)
                RS!Field3 = i & " Field 4 " & Chr(65 + Rnd() * 26)
                RS!Field4 = i & " Field 5 " & Chr(65 + Rnd() * 26)
                RS!Field5 = i & " Field 6 " & Chr(65 + Rnd() * 26)
             RS.Update
             RS.MoveNext
          Loop
          RS.MoveFirst
       Next
       RS.Close
    End Sub

  12. #12
    Join Date
    Jul 2004
    Location
    South Dakota
    Posts
    267
    OK. Now I'm really wondering what is going on. I too have tried something.

    I created a table with 6 fields (I added an ID field that is unique). I loaded each field with the number of the record up to 10,000. So all fields for record 1 contain 1 and record 2 all contain 2 and so on.

I ran your code and got the same results. Then I re-created the table like I initially did. Then I modified the code: instead of using the DAO update method, I moved the update into an SQL statement.

    Code:
    strSQL = "UPDATE Table1 SET Field1 = '" & i & " ABCDEFGHIJKLMNOPQRSTUVWXYZ " & Chr(65 + Rnd() * 26) & "', " & _
             "Field2 = '" & i & " HORRIBLYBLOATEDDATABASE " & Chr(65 + Rnd() * 26) & "', " & _
             "Field3 = '" & i & " Field 4 " & Chr(65 + Rnd() * 26) & "', " & _
             "Field4 = '" & i & " Field 5 " & Chr(65 + Rnd() * 26) & "', " & _
             "Field5 = '" & i & " Field 6 " & Chr(65 + Rnd() * 26) & "' " & _
             "WHERE ID = " & RS("ID")
    
    CurrentDb.Execute strSQL
    Then I re-ran the code. The table was 0.796875 mb prior to running the code. After the code it was 1.61328125 mb. Much better than the 78 mb I was getting.

    So it seems that some of the issue is coming from DAO itself. If you can move the updating into an SQL statement it looks like it will reduce bloat but may increase the run time of the code.
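A variation on this - just a sketch, and the parameter names here are assumptions, not from the thread - is to build the UPDATE once as a temporary parameterised QueryDef and execute it per record, which avoids re-parsing a freshly concatenated SQL string every time and sidesteps any quote-escaping problems in the data:

    Code:
    'Hypothetical sketch: a temporary parameterised QueryDef, executed
    'per record, instead of concatenating a new SQL string each time.
    Dim qdf As DAO.QueryDef
    Set qdf = CurrentDb.CreateQueryDef("", _
        "PARAMETERS pF1 Text(255), pID Long; " & _
        "UPDATE Table1 SET Field1 = pF1 WHERE ID = pID")
    
    qdf!pF1 = i & " ABCDEFGHIJKLMNOPQRSTUVWXYZ " & Chr(65 + Rnd() * 26)
    qdf!pID = RS("ID")
    qdf.Execute dbFailOnError
    qdf.Close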

    C

  13. #13
    Join Date
    Jun 2011
    Posts
    6
    Great! Thanks Canopus! Your suggestion of directly using a SQL update statement works for me as well. In a quick, rough test the difference in speed doesn't look to be a major issue.

By the way, I posted this problem on another forum and one comment was that Access may be keeping the prior values in memory to allow a rollback and not freeing these up, which had crossed my mind. I did try using BeginTrans and CommitTrans but this didn't make a difference - however, I could well have not been using them correctly (although no error was generated.)
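For reference, this is roughly the pattern I tried for the transaction calls - a sketch only, and I'm not certain it's the correct usage:

    Code:
    'Sketch of wrapping the updates in a DAO transaction - I'm not
    'certain this is the right pattern, but it's what I tried:
    Dim ws As DAO.Workspace
    Set ws = DBEngine.Workspaces(0)   'default workspace (same one CurrentDb uses)
    
    ws.BeginTrans
       '... perform the RS.Edit / RS.Update loop here ...
    ws.CommitTrans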

    In any case - I'm 99% sure your suggestion will solve my problem which is a huge relief as I had no idea what was going wrong. The "real life" database where I first encountered the problem was exploding to over the 2GB limit because I had several hundred thousand updates.

    Once again, many thanks!

    Regards
    HedgePig

  14. #14
    Join Date
    Jul 2015
    Posts
    1
Sorry for the necro, but I ran into this same issue. I think it may have to do with the page size, because the bloat does not seem to occur if the field being updated is moved into the second position (right after the PK). So, if moving the columns is realistic in your situation, that may help.

    Moving from DAO to ADO also seems to prevent the bloat without any modifications to columns necessary. Be prepared for this to run much slower than DAO, at least on local tables.
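An ADO version of the update loop might be sketched like this - the connection, table and field names are illustrative, and this is untested against the original database:

    Code:
    'Hypothetical ADO sketch of the same update loop (late bound).
    Dim cn As Object, rs As Object
    Set cn = CurrentProject.Connection   'ADO connection to this database
    Set rs = CreateObject("ADODB.Recordset")
    
    rs.Open "SELECT ID, Field1 FROM Table1", cn, 1, 3   'adOpenKeyset, adLockOptimistic
    Do While Not rs.EOF
       rs!Field1 = "new value"   'ADO needs no Edit call before assigning
       rs.Update
       rs.MoveNext
    Loop
    rs.Close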
