Results 1 to 5 of 5
  1. #1
    Join Date
    Oct 2011
    Location
    Hamburg, Germany
    Posts
    18

    load tab separated file with external table

    Hey,

    i have an external table für tab separated files

    Code:
    create table external_example (
      col1   Number
      ,col2  Varchar2(20)
      ,col3  Varchar2(20)
    )
    ORGANIZATION EXTERNAL
    (
      TYPE ORACLE_LOADER
      DEFAULT DIRECTORY DATA
      ACCESS PARAMETERS
      (
        RECORDS DELIMITED BY NEWLINE
        BADFILE 'BadFile.bad'
        LOGFILE 'LogFile.log'
        NODISCARDFILE
        FIELDS TERMINATED BY 0x'9'
        MISSING FIELD VALUES ARE NULL
        (
          col1   
          ,col2  
          ,col3
        )
      )
      LOCATION ('file_with_tabs.txt')
    )
    ;
    files is looks like
    #unimportant header#
    1[\t]Text1[\t]Text2
    2[\t]qwerty[\t]qwer
    3[\t]Text3[\t]Text4
    4[\t]Text Text[\t]Text5

    [\t] = Tab character = x09
    since friday i get a file wich contains data like that can't be parsed by external table. there 3 columns
    1. 1
    2. null
    3. Text1 text[\t]text

    #unimportant header#
    1[\t][\t]"Text1 text[\t]text"


    [\t] = Tab character = x09
    look at a tab character in the last field!

    The first idea was to use the function
    OPTIONALLY ENCLOSED BY '"'

    but this does not work with tabs and other whitspaces correctly as I found out later.
    in metalink is an Doc:414394.1


    SQL*Loader treats blanks, tabs and newlines as whitespace. When whitespace is encountered the current field is terminated.

    The reason for the skipping of the tabs is the inclusion of the ENCLOSURE delimiter. If either mandatory or optional enclosure delimiters are present loader will skip all whitespace (blank, tab) until it hits a character. If the enclosure is optional that character need not match the enclosure char, but if it's mandatory that character must match or the row will be rejected.

    What is happening here is that the 2nd tab is considered as null character column is reached and then skipped because of the optional enclosure char, which causes the Loader to consider null character to insert for date field.
    does someone a solution for this problem?

  2. #2
    Join Date
    Feb 2006
    Posts
    172
    You can strip the double quotes(") before processing, but you've got another problem, looks like whoever supplies you the file has changed the format, I recommend getting them to fix the formatting or telling you what the new format is going to be, because now you have four fields instead of three if the field separator should always be a TAB.
    If they are now going to have alphanumeric fields enclosed with double quotes(") and the fields can have embedded TAB's then they will need to change the field separator to something else, Maybe a pipe symbol(|).

  3. #3
    Join Date
    Jun 2003
    Location
    West Palm Beach, FL
    Posts
    2,681

    Thumbs down

    This "whitespace" issue is the reason we all prefer to use some other terminator character like pipe (|) or caret (^).
    The person who says it can't be done should not interrupt the person doing it. -- Chinese proverb

  4. #4
    Join Date
    Oct 2011
    Location
    Hamburg, Germany
    Posts
    18
    Quote Originally Posted by LKBrwn_DBA View Post
    This "whitespace" issue is the reason we all prefer to use some other terminator character like pipe (|) or caret (^).
    since friday i do it too
    but the problem is, we get this files from a data provider, so maybe it will be difficult to change a format.

    i thought that maybe someone has a solution for this problem

  5. #5
    Join Date
    Jun 2003
    Location
    West Palm Beach, FL
    Posts
    2,681

    Cool

    Quote Originally Posted by shureg View Post
    since friday i do it too
    but the problem is, we get this files from a data provider, so maybe it will be difficult to change a format.

    i thought that maybe someone has a solution for this problem
    The only solution is to load (or create external table) where each record is one varchar column and parse it yourself.

    BEST solution is to format the data with correct field delimiters at the source.
    Tell data provider: 'We CANNOT process your data "as is"...'
    The person who says it can't be done should not interrupt the person doing it. -- Chinese proverb

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •