Results 1 to 14 of 14
  1. #1
    Join Date
    Apr 2004
    Location
    outside the rim
    Posts
    1,011

    Unanswered: Moving data btw Excel and Access in bulk

    Looking for some opinions and ideas (preferably about the question below).

    App extracts the information found in files (spreadsheets or CSV files) that are attached to email messages.

    There are several different sources for the files, and therefore, each source is a different structure. Since all the data goes into one table, I created a "cross reference" table that tells my extraction code which field index in the data file goes into which field index in the table.

    Problem is, my current extraction method runs line by line through Excel, and it is slow.

    Any thoughts on a faster way to grab the bulk of data from Excel, reorder the columns as defined in a seperate table and append the data in the main table?

    I thought of using quries to reorder the columns and linking to the excel file ad hoc, but the hitch here is: we add new sources and revise existing sources regularly. The nice thing about the "cross-reference" table is that i can go in one place, make the change and it drives everything else.

    Especially since we end up having to return a certain percentage of the information and it has to be assembled back into the original file format in the correct order, so I utilize the above process in reverse to reconstitute the data files from the database. This process is also driven off the "cross-reference" table.

    tc
    have fun!

    Todd S.
    click to email


    Independent Development Consultant
    Biz db & reports - SQL, Access, Crystal, Seradex and more
    Small, custom, unique programs - Office, VBA, .NET


    _________________________________________________
    Favorite message from Windows:
    There was an error displaying the previous error message

    Sadly, there was no error number to look up ...

  2. #2
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Hi tc

    Would you be able to use your cross reference table (or something similar to it) to dynamically build your queries?
    Testimonial:
    pootle flump
    ur codings are working excelent.

  3. #3
    Join Date
    Feb 2004
    Location
    Chicago, IL
    Posts
    1,312
    It sounds like you and I are working on a similar project. I data in several formats (Excel, CSV, fixed field, etc.) and I am trying to automate the processing of the data. I am not sure how much help I can be because it sounds like we have similar approaches but different requirements.

    For each file type I process I have a file spec. I link to the file and then create a unique append query (on the fly) to append the data to a common table. I work with mailing lists so there are several common fields and several generic fields that I use for unique data. For example, I have Name, Address, city, etc. for my common field names and MiscText1, MiscText2, etc. for my generic fields. Once I get the file in the common table format I treat all files the same. That may or may not help you, but I also have a similar requirement of being able to report 'exactly' what we received for data.

    In order to ensure that the data is imported and not changed I add the whole record to the end of each record and then when I link to it I put it in a memo field. You have to do this differently if the file is a CSV file or fixed field. That adds a little complexity but not too bad. I only need to keep the data for a few records so I have a process of marking n records and then collecting the 'raw data'.

    As far as the formats changing, I don't have any advice for you I am planning to make changes as they are needed. I haven't figured out if I need to preserve the 'old' way of processing data if I need to add a field or change the process. I may need to re-process old files so I may need to retain that information.

    Another thing you might run into (very frustraing for me) is when you link to Excel they are times when Excel 'thinks' a field is a number but it is actually a string. In my case it comes up quite regularly working with zip codes. There is no way to override (that I know of) what Excel thinks and as a result data is not imported properly. So I convert everything I get to a CSV (for me it is not many).

    Hopefully something I have mentioned will help you. Feel free to ask if you want anymore detail on certain aspects. Keep in mind that this is all very theoretical right now, my project is not complete (90%). But this is a rewrite of a project that I wrote a few years ago.

  4. #4
    Join Date
    Mar 2003
    Location
    The Bottom of The Barrel
    Posts
    6,102
    Provided Answers: 1
    Quote Originally Posted by DCKunkle
    Another thing you might run into (very frustraing for me) is when you link to Excel they are times when Excel 'thinks' a field is a number but it is actually a string. In my case it comes up quite regularly working with zip codes. There is no way to override (that I know of) what Excel thinks and as a result data is not imported properly. So I convert everything I get to a CSV (for me it is not many).

    .
    Try these:

    Hkey_Local_Machine/Software/Microsoft/Jet/4.0/Engines/Excel/ImportMixedTypes
    Hkey_Local_Machine/Software/Microsoft/Jet/4.0/Engines/Excel/TypeGuessRows

    Mine are set to Text and 0 accordingly.
    oh yeah... documentation... I have heard of that.

    *** What Do You Want In The MS Access Forum? ***

  5. #5
    Join Date
    Apr 2004
    Location
    outside the rim
    Posts
    1,011
    ok, so I will look into dynamic queries. I am thinking something along the lines of:
    1) save the attachment to the temp folder
    2) create a linked table on the fly using the cross-reference table as the driver. I can add to this table a note of format (csv, xls, etc.)
    3) create the update query dynamically
    4) run the append

    I've been getting around the format guessing by Excel by explicitly defining how I want the file opened. For example:
    Code:
    .Workbooks.OpenText FileName:=sFile, Origin:=xlWindows, StartRow:=1, DataType:=xlDelimited, _
                            TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=False, Tab:=True, Semicolon:= _
                            False, Comma:=False, Space:=False, Other:=False, FieldInfo:=Array(Array _
                            (1, 2), Array(2, 3), Array(3, 2), Array(4, 2), Array(5, 2), Array(6, 2), Array(7, 2), _
                            Array(8, 2), Array(9, 2), Array(10, 2), Array(11, 2), Array(12, 2), Array(13, 2), _
                            Array(14, 2), Array(15, 2))
    The second value in the Array parameter is xlTextFormat, which is 2. Another way to correct zip codes that lose the leading zero is to use the Format command after it's been imported, although that's more risky because it assumes the data had a leading zero to begin with.

    thanks,
    tc

  6. #6
    Join Date
    Feb 2004
    Location
    Chicago, IL
    Posts
    1,312
    I am just curious, how are you going to do #2? I was trying to figure out how to do it and the only way I thought of was to create File Specifications on the fly using the MSysIMEXColumns and MSysIMEXSpecs tables.

    I actually never tried it that way, I only considered it. My layouts don't change real frequently (once or twice a year for most programs) so I figured I would make the changes myself. But if you had a slicker way and were willing to share, I would definitely be interested.

  7. #7
    Join Date
    Apr 2004
    Location
    outside the rim
    Posts
    1,011
    I don't have a slicker way yet
    I am working on it now, so if anybody has an idea, please post!

    tc

  8. #8
    Join Date
    Apr 2004
    Location
    outside the rim
    Posts
    1,011
    The closest I came to dynamically linking to the Excel data is possibly by using ADO.

    Note that I've been avoiding ADO all these years, so I didn't get very far with it.

    There is also scant info on revising the MSysIMEXColumns and MSysIMEXSpecs tables.

    I was fiddling with the following code, just to play around. I want to "create" a linked table and skip all the import spec stuff. I kept stumbling on the field definition. If I skip the field definition, I get an error "No fields". If I try to define fiedls, I get an error "Invalid field type" and then Access crashes.

    I tried setting HDR=YES (which tells Access t use the first row as field names) and it failed just the same.

    Here's where I left off, maybe somebody will see the err of my ways:
    Code:
    Private Sub Command0_Click()
    
        Dim objExcel As Excel.Application
        Dim tblDef As DAO.TableDef
        Dim fld As DAO.Field
        Dim fldNew As DAO.Field
        Dim rsKey As DAO.Recordset
        Dim rsLeads As DAO.Recordset
             
        Set tblDef = CurrentDb.CreateTableDef("TempCamp")
        tblDef.Connect = "Excel 5.0;HDR=NO;IMEX=2;DATABASE=C:\LeadTestB.xls"
        
        Set rsKey = CurrentDb.OpenRecordset("Select * from tblImportMaster where ImportCampaignKey = 2 Order By ImportFieldOrder")
        Set rsLeads = CurrentDb.OpenRecordset("Select * from tblAFuelLeads")
    
        Do
            Set fld = rsLeads.Fields((rsKey("ImportFieldIndex")))
            Set fldNew = New DAO.Field
            tblDef.Fields.Append fldNew
            fldNew.Name = fld.Name
            fldNew.Type = fld.Type
            If fld.Type = dbText Then fldNew.AllowZeroLength = fld.AllowZeroLength
            tblDef.Fields.Refresh
            rsKey.MoveNext
        Loop Until rsKey.EOF
    
        CurrentDb.TableDefs.Append tblDef
        
        Set rsKey = Nothing
        Set tblDef = Nothing
        
    End Sub
    The red line is where it crashes.

  9. #9
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Perhaps a hetergeneous query is in order?

    Code:
    SELECT *
    FROM [TheWorksheet$] IN '' [Excel 5.0;HDR=YES;IMEX=2;DATABASE=C:\SomeStuff.xls;TABLE=TheWorksheet$];
    Any use?
    Testimonial:
    pootle flump
    ur codings are working excelent.

  10. #10
    Join Date
    Apr 2004
    Location
    outside the rim
    Posts
    1,011
    pootle, that's ingenious.

    I am not familiar with this type of query (although I like it already).

    Do you know what the "IMEX=2" parameter is?

    Can I set it to start at, say, the third row. What is happening to me is that some of my source files begin the data on the third row and this query call starts at either line 1 or 2, depending on the HDR setting. Either way, I end up with a sheet full of #NUM! values for all of the non-text fields.

    The only thing that worries me is that I am relying on Access (or Excel?) to determine the data type in each column.

    I think this will work - modify previous post to:

    1) Open the 'cross-reference' to a recordset and use it to craft an SQL statement that selects from the above query and appends to the main table
    2) Save the attachments to temp folder
    3) Create query that links to Excel on the fly
    4) use RunSQL to execute

    Loop 2 thru 4 for all emails.


    tc

  11. #11
    Join Date
    Apr 2004
    Location
    outside the rim
    Posts
    1,011
    DCKunkle, here's the code I used to test:
    Code:
        Dim qry As DAO.QueryDef
        Dim sT As String
        
        sT = "SELECT * FROM [Complete List$] IN '' [Excel 5.0;HDR=NO;IMEX=2;DATABASE=C:\10k.xls;TABLE='Complete List'$];"
        
        Set qry = New DAO.QueryDef
        qry.SQL = sT
        qry.Name = "qrtTest"
        CurrentDb.QueryDefs.Append qry
        Set qry = Nothing

  12. #12
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Quote Originally Posted by tcace
    pootle, that's ingenious.

    I am not familiar with this type of query (although I like it already).
    One of my All Time top 5 Access tools, the hetergeneous\ distributed query.
    Quote Originally Posted by tcace
    Do you know what the "IMEX=2" parameter is?
    Er..... I think it is to do with how the program is to handle alphanumerics and numerics but I'm not sure how or what the setting really means.
    Quote Originally Posted by tcace
    The only thing that worries me is that I am relying on Access (or Excel?) to determine the data type in each column.
    If you have a table specifying the spreadsheet set up perhaps you could build up a refined version of the query? e.g.
    Code:
    SELECT CInt(Field1) AS TheNumber, Field2 AS TheString, CDate(Field3) AS TheDate
    FROM BlahDeBlahBlah
    Quote Originally Posted by tcace
    Can I set it to start at, say, the third row. What is happening to me is that some of my source files begin the data on the third row and this query call starts at either line 1 or 2, depending on the HDR setting. Either way, I end up with a sheet full of #NUM! values for all of the non-text fields.
    I will have a play.
    Testimonial:
    pootle flump
    ur codings are working excelent.

  13. #13
    Join Date
    Apr 2004
    Location
    outside the rim
    Posts
    1,011
    Well, I've actually created somewhat elegent code, and it is SUBSTANTIALLY faster than before (an operation that used to take 5 to 10 minutes now takes under 30 seconds).

    I handled the "start at a row other than 1 or 2" by deleting the initial rows that are meaningless before saving the spreadsheet to the temp folder. I had to open each file anyway to rename the first worksheet.

    Most dynamic queries became simply dynamic SQL run directly from code. The "Cross Reference" table makes a very nice switchboard:
    Code:
        ' Open the Import Master and create the cross-reference update query
        Set rsRef = CurrentDb.OpenRecordset("Select * From tblImportMaster Where ImportCampaignKey = " & rsSet("SetKey") & " Order By ImportFieldOrder")
        sUpdateSQL = "INSERT INTO tblAFuelLeads ( "
        sSQL = "SELECT "
        Do
            sUpdateSQL = sUpdateSQL & rsLeads.Fields(rsRef("ImportFieldIndex")).Name & ", "
            sSQL = sSQL & "ImportLeads.F" & Format(rsRef("ImportFieldOrder")) & ", "
            rsRef.MoveNext
        Loop Until rsRef.EOF
        sUpdateSQL = sUpdateSQL & "Campaign, EmailBatchSubject ) "
        sSQL = sSQL & rsSet("SetKey") & " AS Campaign, '" & Nz(rsTemp("Text2"), "No subject") & "' AS EmailBatchSubject"
        sUpdateSQL = sUpdateSQL & sSQL & " FROM ImportLeads WHERE ImportLeads.F1 Is Not Null;"
        Set rsRef = Nothing
    "ImportLeads" is the dynamically written heterogeous query and this is run in code. Very tidy.

    Thanks pootle. btw, what are the other 4 things on your top five list?
    Quote Originally Posted by pootle
    One of my All Time top 5 Access tools, the hetergeneous\ distributed query.
    Should I find myself in England, or yourself in the States, I must buy you a beer or two!

    tc

  14. #14
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Um..... not that I just pour nonsense onto the page or anything but I'll have a think about that one.

    Super - I will trade you all the virtual beers I owe to people on this board
    Testimonial:
    pootle flump
    ur codings are working excelent.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •