  1. #1
    Join Date
    Nov 2003
    Posts
    33

    Unanswered: performance of bulk insert

    Hi,
    I have a database that holds information about customers; each customer's data is held in many related tables. In one stored procedure I need to insert/update many customers. I have two options: a bulk insert of all the customers' data in one transaction, or inserting each customer in a separate transaction. My question is: which approach is considered correct, and which gives better performance, assuming that the number of inserted/updated customers is very large?
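    For illustration, the two shapes I am choosing between look roughly like this (the table and column names here are just placeholders, not my real schema):
    Code:
    -- Option 1: all customers in one transaction, e.g. loaded from staging tables
    BEGIN TRANSACTION
        INSERT INTO dbo.Customer (CustomerID, CustomerName)
            SELECT CustomerID, CustomerName FROM dbo.CustomerStaging
        INSERT INTO dbo.CustomerAddress (CustomerID, AddressLine)
            SELECT CustomerID, AddressLine FROM dbo.CustomerAddressStaging
    COMMIT TRANSACTION

    -- Option 2: one transaction per customer, repeated in a loop
    -- (@CustomerID, @CustomerName, @AddressLine would be procedure parameters)
    BEGIN TRANSACTION
        INSERT INTO dbo.Customer (CustomerID, CustomerName)
            VALUES (@CustomerID, @CustomerName)
        INSERT INTO dbo.CustomerAddress (CustomerID, AddressLine)
            VALUES (@CustomerID, @AddressLine)
    COMMIT TRANSACTION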
    Thank you
    Yulia

  2. #2
    Join Date
    May 2004
    Posts
    125
    How many rows are you looking to insert or update?

    If it is 2-5 thousand at a time, our dba lets us do that. If it gets larger than that, she usually likes us to do a cursor for the update or insert. Cursors will not lock the tables as much, so the larger updates won't hurt performance as much for the other processes running on the db. If you are running the process off hours and you do not have to worry about other processes as much, you can use bulk inserts and updates for a larger number of rows.
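    Roughly what she has us do for the bigger runs looks like this (the staging table and column names are made up for the example):
    Code:
    DECLARE @CustomerID int, @CustomerName varchar(100)

    DECLARE cust_cur CURSOR LOCAL FAST_FORWARD FOR
        SELECT CustomerID, CustomerName FROM dbo.CustomerStaging

    OPEN cust_cur
    FETCH NEXT FROM cust_cur INTO @CustomerID, @CustomerName

    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- one small insert or update at a time keeps the locks short-lived
        IF EXISTS (SELECT 1 FROM dbo.Customer WHERE CustomerID = @CustomerID)
            UPDATE dbo.Customer SET CustomerName = @CustomerName
            WHERE CustomerID = @CustomerID
        ELSE
            INSERT INTO dbo.Customer (CustomerID, CustomerName)
            VALUES (@CustomerID, @CustomerName)

        FETCH NEXT FROM cust_cur INTO @CustomerID, @CustomerName
    END

    CLOSE cust_cur
    DEALLOCATE cust_cur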

  3. #3
    Join Date
    Jan 2004
    Location
    In a large office with bad lighting
    Posts
    1,040
    Quote Originally Posted by yuliam
    My question is: which approach is considered correct, and which gives better performance, assuming that the number of inserted/updated customers is very large?
    The answer is ... it depends.

    More properly, since you have identified that the customer data exists in several tables (which implies relationships), you probably need to enforce ACID properties, which means transactional consistency for each customer insertion / update.

    Now the question becomes ... how much locking can you, your applications, and your users live with? You stated that "... the number of inserted/updated customers is very large ...". How large is very large? How high is up?

    If you use one transaction for each customer, you only hold the locks for the brief time needed to maintain data and row consistency across however many tables are involved. You could easily identify single customer failures, and write or flag the failure for follow-up. Your process will probably result in row-level locking (unless you start splitting data and / or index pages); however, if very large is on the order of millions of entries and several tables, your insert / update could take hours and lots of system resources to complete.
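    As a rough sketch of that per-customer variant, assuming a parent table, one child table, and an error table to flag failures (all of the names here are invented; on SQL Server 2000 you check @@ERROR after each statement, on later versions you could use TRY/CATCH instead):
    Code:
    -- executed once per customer, inside whatever loop or cursor drives the load
    DECLARE @err int

    BEGIN TRANSACTION

    INSERT INTO dbo.Customer (CustomerID, CustomerName)
    VALUES (@CustomerID, @CustomerName)
    SET @err = @@ERROR

    IF @err = 0
    BEGIN
        INSERT INTO dbo.CustomerAddress (CustomerID, AddressLine)
        VALUES (@CustomerID, @AddressLine)
        SET @err = @@ERROR
    END

    IF @err = 0
        COMMIT TRANSACTION          -- this customer is in, consistently
    ELSE
    BEGIN
        ROLLBACK TRANSACTION        -- only this customer is rolled back
        -- flag the failure for follow-up, then carry on with the next customer
        INSERT INTO dbo.CustomerLoadError (CustomerID, LoadDate)
        VALUES (@CustomerID, GETDATE())
    END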

    If you try to batch them all together, you will find that tracking a single datum failure becomes much more difficult; you will probably end up with page and table locking (which **will** affect application performance), and possibly a massive transaction log.

    Smaller batch processing (say 100 or 500 or 1000 or some other number) puts you between these two extremes, but still with the same concerns.
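    A sketch of that middle ground, working through a staging table in CustomerID ranges so each transaction touches at most a batch's worth of customers (the table names and the batch size are arbitrary):
    Code:
    DECLARE @BatchSize int, @LastID int, @MaxID int
    SET @BatchSize = 1000          -- 100, 500, 1000 ... whatever testing shows works
    SET @LastID = 0
    SELECT @MaxID = MAX(CustomerID) FROM dbo.CustomerStaging

    WHILE @LastID < @MaxID
    BEGIN
        BEGIN TRANSACTION

        -- move at most one key range of customers per transaction
        INSERT INTO dbo.Customer (CustomerID, CustomerName)
        SELECT CustomerID, CustomerName
        FROM dbo.CustomerStaging
        WHERE CustomerID > @LastID
          AND CustomerID <= @LastID + @BatchSize

        COMMIT TRANSACTION         -- short transaction, small log hit per batch

        SET @LastID = @LastID + @BatchSize
    END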

    So ... it depends. Test and find out what is best for your situation. But remember, maintain consistency, or the data becomes unreliable, as does your reputation!
