Results 1 to 5 of 5
  1. #1
    Join Date
    Jan 2003
    Location
    Pittsburgh, PA
    Posts
    86

    Unanswered: Importing multiple Excel Files

    I have a client who is sending me 800+ excel files each month with sales data. Each of the files is identical in structure, but has sales data for different stores. I receive all these files at the same time.

    Is there a method with Data Transformation Services where I can have it work off of all the files in a given directory. I can set up DTS to work off of specific Excel files with no problem, but what I would like to do is set up a DTS so it could pull from each of the 800+ files.

    Is this possible, or do I need to look at a solution outside of SQL to consolidate the Excel files first?

    The Excel file would have columns similar to the following: store_id, zip_code, sales, transactions.

  2. #2
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    There are lots of different ways to handle this kind of problem.

    By far my largest concern with 800+ Excel files is GIGO (Garbage In, Garbage Out). Even one sheet of undetected junque data, and you can't get a correct answer! Before I can even suggest a technical solution, I need to understand what the risks of bad data are for you, and how much manpower you are willing to commit to either detecting or correcting the bad data... That answer will strongly influence what I suggest for a solution.

    -PatP

  3. #3
    Join Date
    Jan 2003
    Location
    Pittsburgh, PA
    Posts
    86
    In terms of the quality of the data, it comes directly from my client, so as long as what I have stored in our database matches what they gave me, I am in good shape. I run some additional reports from our application to spot anomalies in the data quality.

    In terms of confidence in the file structure integrity: the client isn't very sophisticated (i.e. they cannot generate one file with all the stores in it), but their result file is very simple and routine. Last month was the first time I had gone through this process with them, and all 843 files were clean.

    In terms of manpower... I'm a one man technical staff for a company with 50--70 employees, and several clients, and the database management / design is only one part of my responsibilities. Considering I would be doing this particular task once a month, 30 min - 45 min of effort would hopefully be the max I would have to spend.

    Having done manipulation of customer / client files for several years now, I know to expect the unexpected; and GIGO is always one of my biggest concerns. Having no support staff to help, limits the effort I can make, but I need to know that what I received is what I loaded. If something cannot be loaded, identifying that would be great.

    I don't mind loading the data into a staging area... and then doing some post-load validations in SQL before moving the data to the intended tables in the database. From my perspective, the key is getting it out of the Excel format, which is terribly unfriendly for me to validate against, into a format / structure that I can.

    Does that help guide you?

  4. #4
    Join Date
    Jul 2003
    Location
    San Antonio, TX
    Posts
    3,662
    I am not sure about DTS, because scripting is my preference, so there's gotta be someone that would come up with a DTS solution (I've seen a couple of samples on the net that demonstrate ennumeration of files in a directory from a DTS/SSIS package).

    But in scripting world you can have a batch file that would dump a list of files into a file queue table, and then extract a script-like output into a TSQL script that you can run in the next step with osql/sqlcmd.

    When I had to deal with Excel files in the past, I ended up building linked servers on the fly for each file in order to be able to handle everything in TSQL (plenty of references on the net as to how to create a linked server to an excel spreadsheet).
    "The data in a record depends on the Key to the record, the Whole Key, and
    nothing but the Key, so help me Codd."

  5. #5
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    You can do this in DTS by defining a filename variable and using VB to cycle through the files. So what you do depends upon whether your comfort zone is in VB or in scripting.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •