Results 1 to 12 of 12
  1. #1
    Join Date
    Oct 2003
    Posts
    268

    Unanswered: Status on Set Based Operations?

    Every time I mention cursors, I hear a lot of people pointing out how you shouldn't use cursors and how set based operations are much faster.

    However, procedural code offers the big advantage that I can embed status/progress type commands into a cursor loop and get accurate data on progress and ETA.

    If I execute a single huge UPDATE/INSERT, I am at the whim and mercy of SQL Server. Seemingly simple statements often run for 50 hours+ and I have no clue how far along it is or anything remotely close to an ETA. Often, that is absolutely inacceptable to have mission criticial hardware to be churning through a process for DAYS without the slightest idea if it is actually making progress or not.

    Is there any way to get progress/status on such an UPDATE/INSERT command?

  2. #2
    Join Date
    Aug 2004
    Posts
    8
    It's a matter of speed. Cursors are slow, and I think you admit this.
    It's true what you are saying: you have a good control of what's happening, but there are cases where you don't have enough processing time or resources for looping through a zillion rows recordset.

    If you use cursors, you may contemplate the opportunity of using procedural language environments (Java, C++, etc) which are designed for such kind of issues and - therefore - are much faster and flexible (manage memory partition, caching, etc, etc). Then a flat file should be all you need...

    You may make a compromise and slice your huge update/insert into smaller chunks, using temporary/ auxiliary tables for keeping the intermediate results.

    Ideally, you should have the proper database model and the correct indexing schema to be able to do it at once.How often this is happening in the real world, I don't know....

  3. #3
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    I'd suggest creating a set based solution, then limiting it to a small percentage (I'd start wtih about 10,000 rows) of the target table. By cutting down the number of rows affected, you can also experiment to see if you are getting the correct results before committing to running the whole query.

    The set based solution should run between a hundred and a hundred-thousand times faster than the row based solution. Cutting the problem down into smaller pieces should also make things noticably faster because of putting less resource demand on the server. The net result should be that you can see the incremental results, and also see them much sooner!

    -PatP

  4. #4
    Join Date
    Oct 2003
    Posts
    268
    Thanks you two!

    Good idea; organize things to run in batches. That will provide performance.

    I could easily use Java/C++ but theoretically cursor logic should run faster as a stored proc. The data doesn't have to be transferred out of SQL Server.

  5. #5
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Cursors will often run faster than client side software, just because the data doesn't have to leave the server... Set oriented code will still run roughshod over cursors though.

    -PatP

  6. #6
    Join Date
    Oct 2003
    Posts
    268
    Quote Originally Posted by Pat Phelan
    Cursors will often run faster than client side software, just because the data doesn't have to leave the server... Set oriented code will still run roughshod over cursors though.

    -PatP
    A lot of people say set oriented code runs the best. The theory is nice and I'm sure in most cases it's true.

    However, I have a set based solution and a cursor based solution to the same problem. I've spent a lot of time on this and have watched the cursor code perform faster. I've posted the set based code in another thread, "Dedup Query", and have curiously gotten no responses. I included DDL and sample data so it should be really easy to test out...

  7. #7
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    You 've watch code perform faster....


    Define fast?

    Last time I looked you have code that runs in days...not minutes...

    But don't take our word for it.....

    Listen...there probably is a perfomance window when a cursor will outperform set based processes...And I'm speaking to when a set based set of code is tuned...

    But it will be very hard to find...most of everything you have asked so far is about updating trillions of rows of data....we've suggested that you do that in chuncks of set based processes...which is in a loop....not a cursor...

    And why you have the need to update ALL of the data points to the fact the initial processes have a problem..hell if you don't want to fix it up from, I'd think I'd prefer to createw a view...in any case....

    I'm waiting for the day when you start to explore multi level cursor nesting....

    Sorry the RNC is in town.....and it's monday...

    over and out....

    01001001011101000111001100100000001101010011101000 11000000110000001000000111001101101111011011010110 01010111011101101000011001010111001001100101
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  8. #8
    Join Date
    Oct 2003
    Posts
    268
    Quote Originally Posted by Brett Kaiser
    And why you have the need to update ALL of the data points to the fact the initial processes have a problem..hell if you don't want to fix it up from, I'd think I'd prefer to createw a view...in any case....
    Thanks for the feedback Brett! My last statement was made largely out of frustration and business pressure.

    I think I've found the critical issue that is causing astronomical, dramatic performance problems: The problem is UPDATE statements against a large table. INSERT and SELECT statements are performing in reasonable down to earth time frames but the UPDATEs are not.

    For large tables, large scale UPDATE staements are completely impractical. it is actually much better to completely rebuild the table (INSERT/SELECT). This is very counterintuitive but it is an important concept to follow.

  9. #9
    Join Date
    Jul 2003
    Location
    San Antonio, TX
    Posts
    3,662
    Hey, Receipients table also needs to be partitioned, it will contribute to improved performance. But clustered index needs to be added (I don't remember if it exists there or not).
    "The data in a record depends on the Key to the record, the Whole Key, and
    nothing but the Key, so help me Codd."

  10. #10
    Join Date
    Apr 2004
    Location
    Kansas City, MO
    Posts
    734
    Brett, that binary thing is just out of control.
    MeanOldDBA
    derrickleggett@hotmail.com
    When life gives you a lemon, fire the DBA.

  11. #11
    Join Date
    Oct 2003
    Posts
    268
    Recipients has a clustered index on DomainID, UserName (basically email address) and has a non-clustered index on unique identity integer.

    I thought the plan was to leave Recipients unpartitioned and partition a new pre-joined, pre-cleaned MailingRecipients (name isn't final) table?

    Quote Originally Posted by rdjabarov
    Hey, Receipients table also needs to be partitioned, it will contribute to improved performance. But clustered index needs to be added (I don't remember if it exists there or not).

  12. #12
    Join Date
    Jul 2003
    Location
    San Antonio, TX
    Posts
    3,662
    Partitioning Recipients will benefit all action queries associated with it. For this to work correctly you'll have to remove a 2-field-based clustered and create a 1-field clustered intex. Exmple of improvement would be in both INSERT and UPDATE because only a specific range of records would be affected per filegroup.
    "The data in a record depends on the Key to the record, the Whole Key, and
    nothing but the Key, so help me Codd."

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •