Results 1 to 4 of 4
  1. #1
    Join Date
    Nov 2002

    Unanswered: speed up large processing tasks

    This is question about how to reduce the overall duration of a large set of queries. Currently, all these queries are run sequentially from a spreadsheet and it take a long while to complete (eg. 9 hours). The datasets are fairly large but there's a lot of analytic stuff going on - such as nested NTILE queries so quite processeor and buffer cache intensive too. I can already see that it makes sense to take the queries, optimise them where possible and put them into stored procedures. Most of these queries can occur in any order - no depedencies. There's plenty of budget to buy a dedicated server with tons of memory and processing power. However, just buying a dedicated server isn't a solution without finding a way to get those processes kicked off in multiple streams. So, I'm looking at ways to kick off large numbers of stored procedures at the same time. In this way, the overall duration will be greatly reduced and the resources on a dedicated server can really be taken advantage of. The question is, how can you best start a large number (eg. 100 or more) stored procedures at the same time?

    Seems like a dumb question perhaps but, wihtout restoring to CLR perhaps, I can't see a way to do it. In programming languages like C++, you can create parallel process threads of course but T-SQL doesn't offer that facility as far as I know. Ideas?


  2. #2
    Join Date
    May 2004
    couple methods come to mind.

    1. use a batch file and kick them off asynchronously with lines like this in the batch file:

    start sqlcmd -Q "your query here" -d YourDatabase -S YourServer -E

    2. call the procs from an agent job, then start the job with call to msdb.dbo.sp_start_job. this does the work in a separate thread in agent - it's asynchronous.

    3. write your own app to do it in compiled code as you already mentioned.

    Note that if you kick off 100 different queries at once, this may take longer than running all 100 in series depending on contention. Particularly if you are doing inserts/updates/deletes to the same tables - these all require exclusive locks so there will be blocking.

  3. #3
    Join Date
    Nov 2002
    Thanks for the suggestions. The one I hadn't thought of was using sqlcmd batches. I did consider the jobs idea but there would be too many jobs - impractical.

    I've read that Service Broker could be used to call sprocs in parallel and this would be good in terms of putting a limit on how many sprocs could be run at the same time - a queue servicing a max number of requests at the same time. ADO.NET also allows for running sprocs in parallel and suppose this would be the managed code route.

    Yes, I take your point about parallel execution not necessarily saving any time but I think it would in this case. I can find ways of working round any locking/bocking that occurs because I have the freedom in this case to adjust the db design (there isn't a db design right now!).

    Thanks again,

  4. #4
    Join Date
    Jul 2008
    I'd use DataRush. It's a Java framework for speeding up data crunching by taking advantage of the other cores. Its a free download so I would suggest checking them out.
    Last edited by JamesF; 07-07-08 at 16:43.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts