Results 1 to 4 of 4
  1. #1
    Join Date
    Oct 2014
    Posts
    2

    Lightbulb Unanswered: Question on "load data into a MS access relational Database"

    I'm fairly new to MS Access and recently working on what my boss call "load data into a MS access relational Database"

    To do the job, I have build the empty tables (6 tables) with primary and foreign keys, linked the table with relationships and enforced the referential integrity.

    Previously I loaded the data into Access table first, then created the relationships, after some manual data cleaning process I was able to link the tables with referential integrity. But now I create the relationship on empty tables first, and then try to "load" the data into these tables at once. So my question is if the latter is doable. If Yes, how shall I do it; If No, what shall I do to "load data into a MS access relational Database"

    Note: I have all the data files ready, but it has inconsistencies and redundancies. Cleaning work will be needed before I can use the data. Please let me know what tools (like VBA, Access SQL) are recommended for the cleaning work.

    Any suggestion or comments will be highly appreciated! Thank you everybody!

    Dimitri

  2. #2
    Join Date
    Mar 2009
    Posts
    5,442
    Provided Answers: 14
    First, I'll assert that you already know how to import data from the data files, as you do not mention anything about that process.

    You could use a working schema such as:

    1. Select the order in which the data files must be processed. As you defined relationships, data from a master (main) table must be imported before data from a related (child or secondary) table.

    2. Import data into a temporary file.

    3. Identify the rows that are already present in the database and those that are not. You can use the query builder wizards to have an idea on how to perform that: one wizard allows you to build a query to identify matching rows in two tables and one builds a query to identify non-matching rows in two tables.

    4. Decide what to do with the duplicate rows, replace the old values with the new values, create duplicate rows (with a new PK) or ignore these rows.

    5. Build a query to insert the new (non-matching) rows in the final destination table, then execute it.

    6. Build a query to handle the duplicate rows according to the decisions you made at step 4, then execute it.

    7. Repeat the process with the data to be imported to the related-secondary-child tables.
    Have a nice day!

  3. #3
    Join Date
    Oct 2014
    Posts
    2
    Quote Originally Posted by Sinndho View Post
    First, I'll assert that you already know how to import data from the data files, as you do not mention anything about that process.

    You could use a working schema such as:

    1. Select the order in which the data files must be processed. As you defined relationships, data from a master (main) table must be imported before data from a related (child or secondary) table.

    2. Import data into a temporary file.

    3. Identify the rows that are already present in the database and those that are not. You can use the query builder wizards to have an idea on how to perform that: one wizard allows you to build a query to identify matching rows in two tables and one builds a query to identify non-matching rows in two tables.

    4. Decide what to do with the duplicate rows, replace the old values with the new values, create duplicate rows (with a new PK) or ignore these rows.

    5. Build a query to insert the new (non-matching) rows in the final destination table, then execute it.

    6. Build a query to handle the duplicate rows according to the decisions you made at step 4, then execute it.

    7. Repeat the process with the data to be imported to the related-secondary-child tables.

    Hi Sinndho, thank you for the detailed instructions, your guideline does make a lot of sense. I added more detailed cleaning procedures and it have worked out. Yeaks! I really appreciated it.

    Here is a follow up question on displaying the records that cannot be appended into my tables, by that I mean if access can output the x, y, z records from the error message as below:

    Microsoft Access set 0 field(s) to Null due to a type conversion failure, and it didn't add x records to the table due to key violations, y records due to lock violations, and z records due to validation rule violations.

    Since this data loading is a practice on daily bases, I'm using VBA and DAO to automate this process. When I use a append query to add records to the existing table, often time many records are rejected due to various reasons. It would be great for me to see all these "rejected records" automatically instead of going back and look at my original data and trying to figure out why they cannot be appended.

    Please let me know if you guys have any thoughts on this. Your advice will be greatly appreciate!

  4. #4
    Join Date
    Mar 2009
    Posts
    5,442
    Provided Answers: 14
    Depending on how you import the data files, I can imagine several solutions to the problem you describe. Here's one:

    1) Define the temporary tables (i.e. the ones that first receive the imported data) with all columns defined as of Text data type (Allow Null, Allow zero-length string). Such a table should be able to accept any type of data, except perhaps very long strings where a Memo data type would be required.

    2) As you're familiar with VBA, use a pair of recordsets, one open on a temporaty table and one open on a final destination table. Read each row of the first recordset and loop through its Fields() collection to check whether the value in the field can be converted to a valid value as defined in the second recordset.

    3) If the test in 2 here above is successful, insert a new row in the second recordset and assing the values from the first recordset, using conversion function if necessary (don't forget that all fields in the first recordset are of Text datatype). When the insert is successful, issue a Delete command for the current row of the first recordset, then move to next row. When you're done looping through the first recordset, only the "offending" rows (i.e. those containing improper data) will remain in the temporary table.

    4) To ensure the integrity of the process, you can use transactions (either the insert and the delete are successful and the transaction is commited, or something goes wrong in the process ans you issue a rollback command). For being able to do so, you cannot open the recordsets on the CurrentDb object. You'll need to use the DBEngine.Workspaces(0).Databases(0) syntax because the methods related to transactions (BeginTrans, CommitTrans, RollBack) are members of a WorkSpace object in DAO (see: http://msdn.microsoft.com/en-us/libr...fice.12).aspx).

    If the data are imported from csv files, you can read each line, use the Split function to transform the read line into an array, then use this array to perform the validation process described in 2 here above. If the test is successfull, the data are inserted in the destination table, if it's not, the data are inserted in an import errors table in which all columns are defined as of Text (or Memo: see above) data type.

    There are other solution involving heavy use of the error trapping mechanisms of Access but personally I tend not to use such solutions when I can (to me an error is something that you cannot foresee and prevent).
    Have a nice day!

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •