I have a CSV with some columns containing quoted text with the text itself containing quotes.
I'm trying to use a Flat File Source in SSIS. I've set a double quote as the text qualifier (") and comma as the delimiter. The trouble I'm having is getting to text that contains a double quote like in the following example;
123,"xyz","Don't you hate when MS say's it's a "feature" and not a bug","more data"
SSIS is seeing the first quote as a text qualifier and the second quote as the end of the text. Because the next delimiter is not found (in this case a comma), it returns;
[staging_output ] Error: The column delimiter for column "Column 3" was not found.
Any ideas on how I can get SSIS to treat the whole column as a single piece of text that just happens to contain quotes?
If it was my CSV it wouldn't be an issue, but unfortunately it isn't and I don't get to dictate the format. It actually gets worse...there are 104 different tables with up to 220 columns all in the one file.
My approach it to use a Flat File Source, a Conditional Split to split by table, a Data Conversion to strongly type the data, and finally a SQL Server Destination.
I seem to have hit a stumbling block on this quote issue. I hate getting held up on stupid little things, but isn't it always the way. I swear I spend more time on stupid little things than on the REAL technical challenges.
Ok. It seems that there is no way to get Flat File Connection to deal with a CSV with embedded quotes.
I have been able to deal with the problem by using an OLE DB Connection with the Microsoft Jet 4.0 OLE DB Provider. This seems to be far better at dealing with CSVs. Be aware that you may also need to change some registry keys for the Text ISAM driver depending on the layout of your CSV. Being that I couldn't strict data type my data just yet, I had to ensure that the driver kept everything as text and didn't scan x number of rows to determine data type.
I have a very complex set of transformations to perform on the data before loading a data warehouse. Why on Earth would I want to "avoid the whole SSIS thing"?
How are you doing those transformations ? via sql blocks, sprocs, etc ?, A Stage then Load type of scenario ? I prefer to keep processes lowest level as possible, ie if I am simply staging a file/data, then performing sql operations on that data and then loading to final destination tables why get other interfaces involved ? To me it makes troubleshooting all that more difficult. Just a personal preference. I guess just a leftover from having to do things with only ksh, perl and BCP/SQLLoader,
The source CSV contains output from multiple tables, so I have to start with 220 text columns and perform a conditional split. From there I can start data typing. I have to stage the data and then load a data warehouse including managing slowly changing dimensions. All of this has to be appropriately managed and logged. SSIS is simply the tool for the job.
I hit the same issue also using SQL 2008/VS2008. On the wizard of the import you need to specify the Text Qualifier by default there is none. Place the double quotes in the box and problem is solved. Hope it helps.