  1. #1
    Join Date
    May 2010
    Posts
    5

    Unanswered: SSIS bulk insert using ASCII text and an xml data column

    I'm trying to map an ASCII text file via SSIS and need any extra data columns loaded into an xml data column. So, if our standard data file layout is 40 data columns and the customer sends in 45, we want to take those extra 5 columns and put them into an xml column in our data table. Each customer may be different, however: one may send in 1 extra data column and another could send in 45.
    Is there a way for SSIS to do this, or is there another way to accomplish loading 1..n extra data columns?
    Last edited by JimOG; 05-03-10 at 16:16. Reason: changed title

  2. #2
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    Sadly there is no clean solution: SSIS binds metadata at design time, so changed file formats will cause it all to fail...

    Try loading your file contents into a single-column table and then manipulating it using T-SQL.
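    For the load step, a minimal sketch of what I mean (the UNC path is illustrative, and this assumes the SQL Server service account can read the file):

    ```sql
    -- A one-column dumping table; each raw line of the file becomes one row
    CREATE TABLE #raw_lines (data varchar(max))

    -- FIELDTERMINATOR is set to a character that never occurs in the data,
    -- so each whole line lands in the single column
    BULK INSERT #raw_lines
    FROM '\\server\share\file.txt'   -- illustrative UNC path
    WITH ( FIELDTERMINATOR = '\0'
         , ROWTERMINATOR   = '\n' )
    ```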

    I'll see if I can knock up an example of what I mean...
    George
    Home | Blog

  3. #3
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    This is by no means a polished solution (the split CTE has bugs), but it should illustrate my point well enough, methinks. Anyhow, hope it helps.
    Code:
    IF Object_ID('tempdb..#staging') IS NOT NULL
      BEGIN
        DROP TABLE #staging
      END
    
    -- Create dumping table with arbitrary surrogate key
    CREATE TABLE #staging (
       id   int identity
     , data varchar(max)
    )
    
    -- Test data
    INSERT INTO #staging (data) VALUES ('one1,two1,20100504,four1,five1,6')
    INSERT INTO #staging (data) VALUES ('one2,two2,20101211,four2,five2,6,7,eight3')
    INSERT INTO #staging (data) VALUES ('one3,two3,20109999,invalid,date,6,7,eight3,9,ten3')
    
    DECLARE @delimiter varchar(2)
        SET @delimiter = ','
    
    -- Split, based on delimiter, each row in to multiple rows
    ; WITH split AS (
      SELECT id
           , SubString(data, CharIndex(@delimiter, data) + Len(@delimiter), Len(data)) + @delimiter As data
           , Left(data, CharIndex(@delimiter, data) -1) As value
           , 1 As level
      FROM   #staging
        UNION ALL
          SELECT a.id
                , SubString(b.data, CharIndex(@delimiter, b.data) + Len(@delimiter), Len(b.data))
                , Left(b.data, CharIndex(@delimiter, b.data) - 1)
               , b.level + 1
          FROM   #staging As a
           INNER
            JOIN split As b
              ON b.id = a.id
          WHERE  CharIndex(@delimiter, b.data) > 0
    )
    , pvt AS ( -- Pivot the resultant columns - note I am only doing six because these are the common columns I'm interested in
      SELECT id
           , Max(CASE WHEN level = 1 THEN value END) As col1
           , Max(CASE WHEN level = 2 THEN value END) As col2
           , Max(CASE WHEN level = 3 THEN value END) As col3
           , Max(CASE WHEN level = 4 THEN value END) As col4
           , Max(CASE WHEN level = 5 THEN value END) As col5
           , Max(CASE WHEN level = 6 THEN value END) As col6
      FROM   split
      GROUP
          BY id
    ) -- Convert types and what not giving final result
    SELECT col1
         , col2
         , Convert(datetime, CASE WHEN IsDate(col3) = 1 THEN col3 END) As col3
         , col4
         , col5
         , Convert(int, CASE WHEN IsNumeric(col6) = 1 THEN col6 END) As col6
    FROM   pvt
    George
    Home | Blog

  4. #4
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Another slow day?

  5. #5
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    This might be a tad better. No idea about perf, but it doesn't seem to be buggy and is easier to follow, IMHO.
    Code:
    IF Object_ID('dbo.parse_delimited_strings', 'TF') IS NOT NULL
      BEGIN
        DROP FUNCTION dbo.parse_delimited_strings
      END
    GO
    
    CREATE FUNCTION dbo.parse_delimited_strings (
       @string    varchar(Max)
     , @delimiter varchar(2)
    )
      RETURNS @values table (
         value    varchar(255)
       , position int
      )
    AS
      BEGIN
        ; WITH cte AS (
          SELECT 0 As [pos]
               , 1 As [level]
            UNION ALL
              SELECT Convert(int, CharIndex(@delimiter, @string, cte.pos + 1))
                   , level + 1
              FROM   cte
              WHERE  CharIndex(@delimiter, @string, cte.pos + 1) > 0
        )
    
        INSERT INTO @values (value, position)
        SELECT Convert(varchar(255), SubString(@string, a.pos + 1, Coalesce(b.pos - 1, Len(@string)) - a.pos)) As [value]
             , a.level
        FROM   cte a
         LEFT
          JOIN cte b
            ON a.level + 1 = b.level
    
        RETURN
      END
    GO
    
    -- Split 'em
    SELECT a.id
         , b.position
         , b.value
    FROM   #staging As a
     CROSS
     APPLY dbo.parse_delimited_strings(a.data, ',') As b
    
    -- Pivot
    SELECT id
         , Max(CASE WHEN position = 1 THEN value END) As col1
         , Max(CASE WHEN position = 2 THEN value END) As col2
         , Max(CASE WHEN position = 3 THEN value END) As col3
         , Max(CASE WHEN position = 4 THEN value END) As col4
         , Max(CASE WHEN position = 5 THEN value END) As col5
         , Max(CASE WHEN position = 6 THEN value END) As col6
    FROM   (
            SELECT a.id
                 , b.position
                 , b.value
            FROM   #staging As a
             CROSS
             APPLY dbo.parse_delimited_strings(a.data, ',') As b
           ) As x
    GROUP
        BY id
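    Coming back to the original requirement of stuffing the extras into an xml column: building on the function above, an untested sketch (table and column names are illustrative) that packs everything past position 6 into a single xml value:

    ```sql
    -- Pack any columns beyond the sixth into one xml value per row
    SELECT a.id
         , (
             SELECT b.position As '@position'
                  , b.value    As 'text()'
             FROM   #staging As s
              CROSS
              APPLY dbo.parse_delimited_strings(s.data, ',') As b
             WHERE  s.id = a.id
               AND  b.position > 6
             FOR XML PATH('col'), ROOT('extra'), TYPE
           ) As extra_cols
    FROM   #staging As a
    ```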
    George
    Home | Blog

  6. #6
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    Quote Originally Posted by pootle flump View Post
    Another slow day?
    Not particularly; just had 5 minutes to kill while provisioning my latest dev env.
    George
    Home | Blog

  7. #7
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    Ok, maybe more like 12 minutes, but still...
    George
    Home | Blog

  8. #8
    Join Date
    May 2010
    Posts
    5
    Thanks for the input - even if it was only 12 minutes

  9. #9
    Join Date
    May 2010
    Posts
    5

    How do you dynamically map the data file?

    The example provided here shows how to parse variable-length records already sitting in a table, but I'm not seeing how to import a pipe-delimited ASCII text data file whose column counts differ by payer. I've looked at SQL Server's OPENROWSET, but it seems limited to pulling all of the data in as
    SINGLE_BLOB | SINGLE_CLOB | SINGLE_NCLOB
    If we have 50 million records in a data file, SQL Server will choke if it attempts to pull all of that into a CLOB/BLOB field. Next, I looked at creating a format file from the data file, the other option with OPENROWSET, but I'm only finding examples of creating format files from database tables. Is there a way to create a format file from a data file itself? Are there other options I should look at?
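    On format files: a non-XML format file is plain text, so while bcp only generates them from existing tables, one can be written (or scripted per payer) by hand to match the data file. A sketch for three pipe-delimited fields, with lengths, names and collation all illustrative:

    ```sql
    /* member.fmt -- hand-written non-XML format file describing three
       pipe-delimited varchar fields, the last terminated by CR/LF:

    9.0
    3
    1  SQLCHAR  0  255  "|"     1  col1  SQL_Latin1_General_CP1_CI_AS
    2  SQLCHAR  0  255  "|"     2  col2  SQL_Latin1_General_CP1_CI_AS
    3  SQLCHAR  0  255  "\r\n"  3  col3  SQL_Latin1_General_CP1_CI_AS
    */

    SELECT *
    FROM   OPENROWSET( BULK '\\server\share\member.txt'  -- illustrative path
                     , FORMATFILE = 'C:\fmt\member.fmt' ) As t
    ```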

  10. #10
    Join Date
    May 2010
    Posts
    5
    I also tried:
    Code:
    INSERT INTO #textImport
    ( line )
    EXECUTE master..xp_cmdShell 'type \\d3719008\DataFiles\2_MemberNEW.txt'
    but the DBAs have turned off xp_cmdShell and are not willing to turn that feature on.

  11. #11
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    Have you tried importing the text files so that each "line" is a single record rather than the whole file?
    George
    Home | Blog

  12. #12
    Join Date
    May 2010
    Posts
    5
    Are you referring to OPENROWSET? Do you have an example you could post?
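    An OPENROWSET(BULK) sketch along those lines, using a hand-written one-column format file so that each line of the file comes back as a single record (the format file path and column name are illustrative; the data file path is the one from post #10):

    ```sql
    /* one_col.fmt -- non-XML format file with a single field per row,
       terminated by the line break only:

    9.0
    1
    1  SQLCHAR  0  8000  "\r\n"  1  line  SQL_Latin1_General_CP1_CI_AS
    */

    SELECT t.line
    FROM   OPENROWSET( BULK '\\d3719008\DataFiles\2_MemberNEW.txt'
                     , FORMATFILE = 'C:\fmt\one_col.fmt' ) As t
    ```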
