  1. #1
    Join Date
    Aug 2009
    Posts
    4

    Unanswered: bcp import to SQL Server 2008 R2 hangs

    I am investigating the intricacies of SQL Server having used it for years as a dumb back-end storage facility.

    I need to import several million records into a SQL table from a pipe-delimited text file. I have tried SSMS 'Import/Export' which is slow so I tried bcp from the command line.

    Importing records into an empty table with no indexes worked like a dream. However...

    I tried to import 13 million records into an indexed table (yeah, I know...). After almost 50% of the import (6,721,000 records) bcp appears to have hung, and the server disk light is flashing continuously.

    Any ideas please? I am going to leave this one overnight to see what happens... Could it be an index issue?

    Tomorrow I plan to drop the indexes - try again - and rebuild the indexes.

    All information gratefully received. Thanks.

    Dave W

  2. #2
    Join Date
    Jan 2003
    Location
    Massachusetts
    Posts
    5,799
    Provided Answers: 11
    Depends how large the table you are importing into is. You are loading 13 million records, so I will guess the table is somewhat larger than that. In that case, the indexes have to be re-balanced. Suppose you have an index on a date field: all of the records in the index are sorted into a b-tree (balanced tree). If all of the records you import have dates greater than the dates already in the index, the tree becomes very unbalanced on the later side. To avoid wildly different index depths for different records, the b-tree gets re-balanced. Dropping the indexes will make the import of the data finish faster, but may make the overall process longer (assuming you rebuild those indexes afterwards).

    Another possibility is that the commit of all these records is taking a significant amount of time. Did you supply a batch size to the bcp command (the -b parameter)?
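    For reference, a batched bcp invocation along the lines suggested above might look like this (server, database, table, and file names are placeholders, not the poster's actual values):

    ```
    rem Character mode (-c), pipe delimiter (-t "|"), trusted connection (-T),
    rem commit every 50,000 rows (-b), log rejected rows to a file (-e)
    bcp MyDb.dbo.MyTable in data.txt -S MYSERVER -T -c -t "|" -b 50000 -e errors.log
    ```

    Without -b, bcp loads the whole file as a single transaction, so a failure or a huge log growth near the end costs the entire run; with a batch size, each batch commits separately.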

    You can check some of this by looking at the wait_type column in the sys.dm_exec_requests dynamic management view for the import session.
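    A query along these lines (session_id 55 is a made-up example; substitute the bcp session's actual id, or filter on the command) would show what the import is waiting on:

    ```sql
    -- See what the bulk-load session is currently waiting on
    SELECT session_id, status, command, wait_type, wait_time, last_wait_type
    FROM sys.dm_exec_requests
    WHERE command LIKE 'BULK%'   -- bulk loads show up as BULK INSERT
       OR session_id = 55;       -- or pin down the known session id
    ```

    Long PAGEIOLATCH_* waits would point at disk, LOGBUFFER/WRITELOG at transaction-log pressure.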

  3. #3
    Join Date
    Aug 2009
    Posts
    4
    Thanks for the reply. The data provided does not have a unique identifier. When I used bcp to import it into a table with no indexes and no primary key, all 13 million records imported fine. So I truncated the table, added a bigint IDENTITY field, and made it the primary key. The same bcp command still runs, but it terminates at 5.8 million records.

    I want to import the data into a table with a pre-defined IDENTITY field as the primary key, and then add non-clustered indexes on 3 further fields. I have over 30 text files to import, all with over 10 million records - each needs a primary key and between 1 and 3 non-clustered indexes.

  4. #4
    Join Date
    Aug 2004
    Location
    Dallas, Texas
    Posts
    831
    Still not sure why you don't add the pk and non-clustered indexes after the import?

  5. #5
    Join Date
    Aug 2009
    Posts
    4
    Thanks for the reply. Well.....

    1) Adding a PK - on a table of 13 million records, could I use ALTER TABLE to add a bigint IDENTITY (1,1) column and then put a clustered unique index on that?

    2) I haven't considered putting the additional indexes on yet - I haven't got that far.

    There is a time/storage issue to all of this - we need the data in tables so we can extract a sub-set of interest as soon as possible. The schemas available do not match the actual data provided, so I have to define all of the columns as VarChar(50). I am concerned about the storage/time overheads of adding an identity field to a table with 13 million records, plus a unique non-clustered index, plus at least two other non-clustered indexes.

    One solution I thought about would be to create a temporary table with an identical schema plus the indexes and primary key, copy the data into it, and then rename the table. Inefficient but practical? Another solution I have seen on forums, but which I can't get to work, is the ALTER TABLE ... SWITCH function, but that only appears to help with changing existing columns to or from IDENTITY - or have I misunderstood?
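    For what it's worth, option 1 (adding the identity after the load) might be sketched like this in T-SQL - table, column, and constraint names here are illustrative, and both statements can be slow and log-heavy on 13 million rows:

    ```sql
    -- Add a bigint identity column to the already-loaded table
    ALTER TABLE dbo.ImportedData
        ADD Id BIGINT IDENTITY(1,1) NOT NULL;

    -- Then make it the clustered primary key
    ALTER TABLE dbo.ImportedData
        ADD CONSTRAINT PK_ImportedData PRIMARY KEY CLUSTERED (Id);
    ```

    Adding the identity column populates every existing row, and creating the clustered PK rewrites the table in key order, so there needs to be room for both operations in the data file and the log.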

  6. #6
    Join Date
    Jan 2003
    Location
    Massachusetts
    Posts
    5,799
    Provided Answers: 11
    Is this data normally being imported into a table with no other data every time, or is it being imported into a table with pre-existing data? If this is an empty table every time, you can truncate the table (this will also reset the identity column), drop any indexes, do the import, then add the indexes back. The identity column can exist without being the PK, or indexed at all, so there is no need to drop that column.
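    The truncate / drop / load / rebuild cycle described above might be scripted roughly as follows (table and index names are illustrative):

    ```sql
    -- 1. Empty the table (also reseeds the identity column)
    TRUNCATE TABLE dbo.ImportedData;

    -- 2. Drop the non-clustered indexes before the load
    DROP INDEX IX_ImportedData_Field1 ON dbo.ImportedData;
    DROP INDEX IX_ImportedData_Field2 ON dbo.ImportedData;

    -- 3. Run the bcp import here (from the command line)

    -- 4. Recreate the indexes after the load
    CREATE NONCLUSTERED INDEX IX_ImportedData_Field1 ON dbo.ImportedData (Field1);
    CREATE NONCLUSTERED INDEX IX_ImportedData_Field2 ON dbo.ImportedData (Field2);
    ```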

  7. #7
    Join Date
    Aug 2004
    Location
    Dallas, Texas
    Posts
    831
    Quote Originally Posted by WinterDL View Post

    One solution I thought about would be to create a temporary table with an identical schema plus the indexes and primary keys - copy the data into it and then rename the table. Inefficient but practical? Another solution I have seen on Forums but which I can't get to work is use the Alter Table ... Switch function but that only appears to change existing variables from IDENTITY - or have I misunderstood?
    This is a staging table and can be a heap (no clustered index) for fast loading. Doesn't bcp have a fast-load option? Create another process to populate the downstream tables with indexes. I've created bcp imports before that loaded 10+ million rows in a couple of minutes.
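    bcp does take bulk-load hints via its -h option; whether a load is minimally logged also depends on the database recovery model and the target table, but a TABLOCK hint is the usual first step. A hedged example (names are placeholders):

    ```
    rem Load into an empty staging heap with a table lock for minimal logging
    bcp MyDb.dbo.Staging in data.txt -S MYSERVER -T -c -t "|" -b 50000 -h "TABLOCK"
    ```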

  8. #8
    Join Date
    Aug 2009
    Posts
    4
    Quote Originally Posted by MCrowley View Post
    Is this data normally being imported into a table with no other data every time, or is it being imported into a table with pre-existing data? If this is an empty table every time, you can truncate the table (this will also reset the identity column), drop any indexes, do the import, then add the indexes back. The identity column can exist without being the PK, or indexed at all, so there is no need to drop that column.
    Thanks again. The table is always empty. Interestingly, when I added an identity column with a bigint type, bcp only loaded 50% of the 13 million records and then terminated cleanly. That was before any indexes were added to the table. Dave
