  1. #1
    Join Date
    Sep 2003
    Posts
    102

    Unanswered: Bulk Insert ignoring Uniqueness constraint ?

    Hi,

    I"m trying to use a BULK INSERT command to insert data into a table from a file.
    There is a UNIQUE Index that is being violated and the BULK INSERT fails.

    I do not want to drop or disable the index; however, I also do not want to load 'duplicate' records, so I keep the CHECK_CONSTRAINTS parameter.

    Is there a way to have the duplicate records output to the ERRORFILE?

    Thanks,

  2. #2
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    I always recommend importing to a staging table first, regardless of whether you are using SSIS, BCP, or some other import wizard.
    Then run a sproc to check the data before importing it into your production table.
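
    Something along these lines, as a rough sketch (all table, column, and file names here are made up for illustration):
    Code:
    -- 1) Load the raw file into a staging table that has no unique constraints.
    BULK INSERT dbo.StagingOrders
    FROM 'C:\imports\orders.txt'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK);

    -- 2) Validate in staging, then move only rows not already in production.
    --    (In-file duplicates would be handled in a separate staging-side check.)
    INSERT INTO dbo.Orders (OrderKey, OrderDate, Amount)
    SELECT s.OrderKey, s.OrderDate, s.Amount
    FROM dbo.StagingOrders AS s
    WHERE NOT EXISTS (SELECT 1 FROM dbo.Orders AS o WHERE o.OrderKey = s.OrderKey);
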
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  3. #3
    Join Date
    Aug 2008
    Posts
    147
    Have you checked the IGNORE_DUP_KEY option on the unique index? With that option set, duplicate rows fail silently instead of aborting the whole insert.
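
    For reference, a sketch of that option (table, index, and column names are made up):
    Code:
    -- Duplicate key rows are discarded with a warning instead of failing the statement.
    CREATE UNIQUE INDEX UX_StagingOrders_OrderKey
        ON dbo.StagingOrders (OrderKey)
        WITH (IGNORE_DUP_KEY = ON);
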
    ---------------------------------
    www.sqlserver-dba.com
    www.dba-db2.com

  4. #4
    Join Date
    Sep 2003
    Posts
    102
    Thanks for the input!

    This is being imported to a staging table, but requirements have the offending field as unique. However, as the input file is outside our control, it could contain duplicate records, and I would not want the BULK INSERT to fail.

    I tried the ON DUPLICATE KEY IGNORE approach, but I do not believe it can be used with BULK INSERT (or I have just been unsuccessful).

  5. #5
    Join Date
    Oct 2009
    Location
    221B Baker St.
    Posts
    486
    Depending on what kind of processing is acceptable on your system, you might consider eliminating the duplicates before attempting the Bulk Insert.

    If an audit trail of the discarded duplicates is needed, they could be placed in a "dups" file rather than just being tossed.
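
    If the raw rows are landed in a staging table first, one way to do that and keep the audit trail is something like this (a sketch only; every object name here is made up, and the "dups" audit table is assumed to already exist):
    Code:
    -- Keep the first occurrence of each key, move the rest to an audit table.
    ;WITH numbered AS (
        SELECT OrderKey, OrderDate, Amount,
               ROW_NUMBER() OVER (PARTITION BY OrderKey ORDER BY OrderKey) AS rn
        FROM dbo.StagingOrders
    )
    DELETE FROM numbered
    OUTPUT DELETED.OrderKey, DELETED.OrderDate, DELETED.Amount
        INTO dbo.StagingOrders_Dups (OrderKey, OrderDate, Amount)
    WHERE rn > 1;
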

  6. #6
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    Quote Originally Posted by ontheDB View Post
    This is being imported to a staging table, but requirements have the offending field as unique.
    There should be no requirements on your staging table. The requirements are on your production table.
    The staging table is where you check for duplicates, relational integrity, data domains, etc.
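
    A minimal sketch of that kind of duplicate check on the staging table (made-up names):
    Code:
    -- List any keys that appear more than once in the staging data.
    SELECT OrderKey, COUNT(*) AS occurrences
    FROM dbo.StagingOrders
    GROUP BY OrderKey
    HAVING COUNT(*) > 1;
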
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  7. #7
    Join Date
    Sep 2003
    Posts
    102
    Hi,

    I agree that, typically, a staging table is used as a staging area where validations etc. are performed on raw data.

    However, in our case, we are looking to take a shortcut to save time by using the unique index to perform that validation task... if possible.

    So assuming we want to do this... is it possible? Or is it just part of the design of BULK INSERT?


    thanks!

  8. #8
    Join Date
    Oct 2009
    Location
    221B Baker St.
    Posts
    486
    One of the assumptions of a "Bulk Insert" (at least everywhere I've used one) is that the file to be used as input contains only valid entries...

    Sometimes, these tools allow skipping certain conditions, but my groups always ensured valid data before the "load".

    fwiw

  9. #9
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    I'm about 85% certain that the ERRORFILE parameter will do what you want, sending the rows with duplicate key values into the error file. I haven't tried this in a long time, but I'm pretty sure that it worked for me at one time.
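
    For reference, those parameters are specified roughly like this (paths and the table name are made up):
    Code:
    BULK INSERT dbo.StagingOrders
    FROM 'C:\imports\orders.txt'
    WITH (
        FIELDTERMINATOR = ',',
        ROWTERMINATOR   = '\n',
        CHECK_CONSTRAINTS,                           -- enforce constraints during the load
        MAXERRORS       = 100,                       -- tolerate up to 100 rejected rows
        ERRORFILE       = 'C:\imports\orders.err'    -- collects rows BULK INSERT cannot parse or convert
    );
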

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  10. #10
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    Quote Originally Posted by ontheDB View Post
    However, in our case, we are looking to take a shortcut to save time...
    Seriously.....how much extra time is involved for query processing, and how much time is going to be lost by not having a robust and error tolerant process?
    Seriously.....
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  11. #11
    Join Date
    Sep 2003
    Posts
    102
    Quote Originally Posted by blindman View Post
    Seriously.....how much extra time is involved for query processing, and how much time is going to be lost by not having a robust and error tolerant process?
    Seriously.....
    Admittedly, performance knowledge is something I am lacking.

    Some details:
    This is part of a daily batch process.
    Approximately 1.5 million records will be in the BULK INSERT.

    The 'load file' is agreed to have only valid records, but since it is coming from a third party, I'd like to have some error checking if it does not cost too much overhead.

    I would not want to perform a validation step on each batch process to catch errors that are unlikely to happen.

    I initially assumed the ERRORFILE would catch the duplicate keys, but when I tested it, it did not.
