Results 1 to 12 of 12
  1. #1
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912

    Unanswered: SS2K5 64bit Dualcore CPUs maxing out

    Hi

    Settled into the new giggero now.

    Since I have moved here we have upgraded our 4 SQL Servers to 2005 64 bit. All 2 proc dual core hyper threaded with 4GB RAM. Windows 2003. SQL Server compatibility level is 90. SP 1 and the CTP for SP2 are installed. We are hanging on with sp2. Maxdop is 0 (but has been tried on 2). Up to a score or so databases per machine, between 8 and 300 GB in size. Offline, reasonably heavy duty ETL batch processing.

    Since upgrading we have had trouble with the CPUs intermittently maxing out at 100% usage. This is when intensive routines are run. We know that many of these routines are crappy (SELECT... INTO creating tables with 10s or 100s of millions of rows, cursors etc.) and we are rewriting the code. Estimated time of completion of code review and rewrite is end of next year so we can't wait that long. The same poor code ran (albeit slowly) in SS2k but is causing the server to lock up now it is being run on SS2k5. Symptoms include connection attempts timing out, routines running very slowly. Existing connections can continue to make requests but these are very slow. If we get on the box the CPUs are at a never wavering 100% utilisation. This continues until intervention is taken, often restarting the service. We have left a maxed out box alone over a weekend and it was still in the same state when we came in on Monday. Nothing to glean from the error logs (AFAIK).

    Obviously the long term fix is to get the code sorted but this is an ongoing process (in fact it is one of the reasons I was hired). Our immediate concern is to get the servers back up to the sort of performance we were realising before we upgraded.

    Does anyone have any ideas?
    As ever eternal gratitude and a crate of virtual beer for any help
    Testimonial:
    pootle flump
    ur codings are working excelent.

  2. #2
    Join Date
    Jan 2003
    Location
    Massachusetts
    Posts
    5,799
    Provided Answers: 11
    If you have a test system that you do not mind killing processes on, you can try picking off individual connections until the CPU recovers. If you have a trace running at the time, you should catch what that process was doing, and you would be able to target your fixes a little better.

  3. #3
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    It is particularly the test\ dev machines that fall over. The Prod code is better. What do you suggest - set up a pc with profiler? Flick a few trace flags (and if so which?)?
    Testimonial:
    pootle flump
    ur codings are working excelent.

  4. #4
    Join Date
    Jan 2003
    Location
    Massachusetts
    Posts
    5,799
    Provided Answers: 11
    Just profiler. Catch the RPC Completed and SP Completed events, which should give you the reads, writes, duration, and number CPU ticks used. More importantly, the texdata field will tell you what query was being run at the time. If the procedure ends naturally, or by you killing it, it should show in the trace. You may get a lot of garbage to sort through, as well, so you may consider restricting the trace to an individual SPID when you go to kill it.

    And how is the Prod code better than the Dev code? Time to do a code refresh from Prod?

  5. #5
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Thanks for the input.

    Sorry - to clarify the project is to upgrade a whole host of legacy, procedural code acting on file based hierarchical databases to SQL Server. Some of the early work was not so hot. Some of these databases were put in to testing and found very wanting (even before these specific problems) and so have not been released to prod.
    Testimonial:
    pootle flump
    ur codings are working excelent.

  6. #6
    Join Date
    Jan 2003
    Location
    Massachusetts
    Posts
    5,799
    Provided Answers: 11
    Pootle. Do you use Full Text Indexing at all?

    http://www.sqlservercentral.com/colu...pearl/2819.asp

  7. #7
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Not us sir but thank you for the idea
    Testimonial:
    pootle flump
    ur codings are working excelent.

  8. #8
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Did I mention that the memory usage will flatten out at about max too?

    We are not the only people to experience this sort of thing e.g.:
    http://forums.microsoft.com/MSDN/Sho...40746&SiteID=1

    We are looking at working through some of the suggestions but... throwing memory at the box is a bit brute force. The MS guys are giving fairly inconsistent advice.

    Ah well.
    Testimonial:
    pootle flump
    ur codings are working excelent.

  9. #9
    Join Date
    Jan 2003
    Location
    Massachusetts
    Posts
    5,799
    Provided Answers: 11
    The thread you posted seems to imply that SP2 fixes this problem. Any luck?

  10. #10
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Hi

    We have installed a few gig extra memory into one of the prod boxes. SP2 is due to be installed shortly.... (the hotfix so early on made us a bit wary).

    Based on this:
    http://www.sqlteam.com/item.asp?ItemID=26788
    we have rebuilt all indexes and updated stats with full scan on one db and found a three fold improvement for the ETL routine on that. We are in the process of running this on all our other databases but at 18+ hours per db (one has been running for over 24 hours now) it will take a little while to cascade this throughout our environment. We haven't actually had a crash since I posted (one load failed but this was a data corruption in the base table index) but we have had some very sluggish periods (bulk inserts in 1000 row batches ranging from a few hundred milliseconds to 10 mins per batch). These are all (effectively) off-line so we can limit access to ensure priority work does not need to compete for resources.

    If we get a crash again I will set up the traces - for now we are concentrating on trying some of the suggested fixes\ optimisations.

    Will keep you posted.
    Testimonial:
    pootle flump
    ur codings are working excelent.

  11. #11
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Ok, you asked for my comments, but you won't like them!

    First, more RAM. There is no bottom to SQL 2005, it can use everything I've been able to throw at it, and it gets faster every time I add RAM. This helps a ton in almost every case.

    Next, the reindexing is a great idea. It helps a lot too.

    My guess is that you've got a couple of tables with really poor choices of clustered index, and you may have some really bad choices of PRIMARY KEY definitions. These are usually the "big hammer" in terms of design problems. Set your primary key definition to a single column (either IDENTITY or GUID) if you can, and whatever you use, smaller (in bytes) is better. Make sure that your clustered index presents rows in the order that they are most frequently processed, regardless of whether that is unique or even seems to make sense from an ER perspective. If you can get composite sub-keys out of the clustered index, that is a bonus!

    Without more detailed knowledge of what you have, what you are doing, and what are your "pain points", this is the best general advice I can give.

    -PatP

  12. #12
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Thanks Pat

    We are considering doubling the RAM in all servers. Index design is something we intend to review at the same time as the code since we know that that is lacking in places too.

    For the immediate future we are keen to get things running at pre-upgrade level which will probably be big hits like RAM, update stats etc. rather than the granular code & design analysis.
    Testimonial:
    pootle flump
    ur codings are working excelent.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •