  1. #1
    Join Date
    Jun 2005
    Posts
    319

    Unanswered: multi-threaded from within T-SQL?

    What is the best practice for launching a new process from within a proc? Do I need to get knee-deep in Service Broker? Here is my code:


    Code:
    	WHILE @@FETCH_STATUS = 0
    	BEGIN
    		-- Loop once per hour block of data
    		
    		-- Setup the Import Partition for the given Hour
    		exec SetupImportPartition @cursorDate  
    
    		-- populates Import table for 1 hour's worth of data (for a given campaign)
    		exec HourlyImport @importDate = @cursorDate, @totalRecords = @outCount OUTPUT 		
    		
    		IF @outCount > 0
    		BEGIN
    			-- Setup the Hourly Partition for the given Hour
    			exec SetupHourlyPartition @cursorDate 
    			
    			-- Switch the data from Import into Hourly
    			exec SwitchImportIntoHourly
    			
    			SET @isThereAnyDataForDay = 1
    		END
    		-- ELSE Nothing to do
    		
    		FETCH NEXT FROM import_cursor INTO @cursorDate
    	END
    exec SetupHourlyPartition @cursorDate
    This line above (also in the code above) can take anywhere from 0 seconds to 1 minute and 20 seconds. It does two things: it splits off a new partition and then merges the last import batch into the hourly data. It's the merge that is expensive, and I would like to split it off into its own process, have it update a log table when complete, and then check the results of the log table before the next merge needs to happen, since that is the only piece of code with a dependency. I'd like the import processing to continue while this merge happens in the background.

  2. #2
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    I'd make a SQL Agent job that did the dirty deed, then use sp_start_job to launch the job. Instant asynchronous processing, and you don't even need to add milk!
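    For illustration, the call is a one-liner; the job name MergeHourlyData below is just a placeholder for whatever job you wrap the merge in:

    Code:
    	-- Fire and forget: sp_start_job returns as soon as SQL Server Agent
    	-- accepts the request; it does not wait for the job to finish
    	EXEC msdb.dbo.sp_start_job @job_name = 'MergeHourlyData'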

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  3. #3
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Buddy bobs - I don't believe that will work. Although sp_start_job will kick the job off asynchronously, you can't have more than one instance of it running. As such, each pass in the loop above will still have to run in series, even if parts of the loop are asynchronous.
    Testimonial:
    pootle flump
    ur codings are working excelent.

  4. #4
    Join Date
    Jun 2005
    Posts
    319
    I think synchronous execution of the job is actually preferable, since if a merge is in progress I do not want to kick off another merge until the first one completes.

    Next question should be easier: how do I check the status of the job via T-SQL (looking to find out if it has completed processing)? I assume I will just add a WAITFOR DELAY type of loop to poll for the results if the job is still running.
    Last edited by Gagnon; 10-06-10 at 13:09.

  5. #5
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Create a table where you can store any run parameters and a column for the results of that run. Populate the table with new rows as the import process completes. Just have the job run every minute or three looking for work in your table. That way the import process just queues up work as it finds it, and the merge process processes work as it finds it... Everybody does their own job, at their own pace!
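    A rough sketch of that pattern (the table and column names here are illustrative, not from the thread):

    Code:
    	-- Work queue: the import loop writes rows, the scheduled merge job reads them
    	CREATE TABLE MergeQueue
    	(
    		queueId     int IDENTITY(1,1) PRIMARY KEY,
    		importDate  datetime NOT NULL,                  -- run parameter for the merge
    		queuedAt    datetime NOT NULL DEFAULT GETDATE(),
    		completedAt datetime NULL,                      -- NULL = not yet processed
    		result      varchar(200) NULL                   -- outcome of that run
    	)
    
    	-- Import loop: queue up work as each hour finishes
    	INSERT INTO MergeQueue (importDate) VALUES (@cursorDate)
    
    	-- Merge job, running every minute or so: pick up the oldest unprocessed row
    	DECLARE @queueId int, @importDate datetime
    	SELECT TOP 1 @queueId = queueId, @importDate = importDate
    	FROM MergeQueue
    	WHERE completedAt IS NULL
    	ORDER BY queueId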

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  6. #6
    Join Date
    Jun 2005
    Posts
    319
    So here's the latest: I implemented the jobs, but I keep running into this problem. I created 20 jobs so that we can run up to 20 in parallel, and I have the code below to get a new job id to run off of.

    Code:
    	SET @jobId = 0
    	
    	-- Time window to wait for a free job slot before giving up
    	SET @startCheck = GETDATE()
    	SELECT @finishCheck = DATEADD(ss, (@pullJobs / 2) * (@top / 1000), @startCheck)
    
    	WHILE ( @jobId = 0 AND @startCheck < @finishCheck )
    	BEGIN
    		-- Grab any slot that has no incomplete batch still attached to it
    		SELECT TOP 1 @jobId = H.batchSlot
    		FROM PullJobHistory H WITH (NOLOCK)
    		WHERE H.batchSlot <= @pullJobs
    		AND H.batchSlot <> @lastJobId -- so we do not try to run the same job twice in a row
    		AND NOT EXISTS
    		(
    			SELECT 1
    			FROM PullJobHistory Z
    			WHERE H.batchSlot = Z.batchSlot
    			AND Z.isComplete = 0
    		)
    
    		SET @jobId = ISNULL(@jobId, 0)
    		IF @jobId = 0
    		BEGIN
    			-- No slot free yet; wait a second and check again
    			WAITFOR DELAY '00:00:01'
    			SET @startCheck = GETDATE()
    		END
    	END
    
    	IF @startCheck >= @finishCheck
    	BEGIN
    		SET @raiseErrorMessage = 'Job timed out: ' + CAST(@startCheck AS varchar(50)) + ' | ' + CAST(@finishCheck AS varchar(50))
    		RAISERROR(@raiseErrorMessage, 15, 1)
    		RETURN
    	END
    
    	SET @lastJobId = @jobId
    The problem is that the code runs so fast that, between spawning the job and checking to see what id is available, I am getting collisions: the code is trying to run the same job that is still executing, resulting in this error:

    Msg 22022, Level 16, State 1, Line 0
    SQLServerAgent Error: Request to run job Pull_Camp_A_Batch_1 (from User sa) refused because the job is already running from a request by User sa.


    I guess what I need is to wait for the last job to fully spawn before continuing with code execution. What is a simple query that I could use to poll for this particular job (to find out when its status changes to executing)? I could poll the table every second until it shows up and then allow the code execution to continue, to ensure I do not outlap myself.

  7. #7
    Join Date
    Jun 2005
    Posts
    319
    I think I can figure it out from this info:

    /*
    One simple way is to query the msdb.dbo.sysjobactivity table for the job and check to see if stop_execution_date is null. If it is, the job is running. At least that's my understanding.
    sp_help_job returns a current_execution_status column. The problem with using this stored procedure is that it cannot be used in an insert...exec statement, because it has an insert...exec statement in it and these cannot be nested. You could go into the procedure and find the procedure it runs to get the job status, but that procedure is not documented.
    Ron Rice
    */

    SELECT * FROM msdb.dbo.sysjobactivity

    SELECT * FROM msdb.dbo.sysjobs
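
    Putting that together, a minimal polling sketch (the job name is taken from the error in post #6; note that sysjobactivity keeps rows from older Agent sessions, so the check filters on the most recent session from msdb.dbo.syssessions):

    Code:
    	DECLARE @running bit
    	SET @running = 1
    
    	WHILE @running = 1
    	BEGIN
    		IF EXISTS
    		(
    			SELECT 1
    			FROM msdb.dbo.sysjobactivity A
    			JOIN msdb.dbo.sysjobs J ON J.job_id = A.job_id
    			WHERE J.name = 'Pull_Camp_A_Batch_1'
    			AND A.session_id = (SELECT MAX(session_id) FROM msdb.dbo.syssessions)
    			AND A.start_execution_date IS NOT NULL
    			AND A.stop_execution_date IS NULL   -- started but not yet stopped = running
    		)
    			WAITFOR DELAY '00:00:01'    -- still running; poll again in a second
    		ELSE
    			SET @running = 0            -- finished (or never started this session)
    	END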

  8. #8
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Quote Originally Posted by Gagnon View Post
    I guess what I need is to wait for the last job to fully spawn before continuing with code execution. What is a simple query that I could use to poll for this particular job (to find out when its status changes to executing)? I could poll the table every second until it shows up and then allow the code execution to continue, to ensure I do not outlap myself.
    I still kinda like my idea from post #5 in this thread.

    It completely avoids that bamboozle, and it is simple to code and use.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  9. #9
    Join Date
    Jun 2005
    Posts
    319
    Quote Originally Posted by Pat Phelan View Post
    I still kinda like my idea from post #5 in this thread.

    It completely avoids that bamboozle, and it is simple to code and use.

    -PatP
    I am still doing that, just running 10 jobs simultaneously, and I am storing the dynamic SQL in the table. So I queue up 10 jobs (up to 20, but 10 seems to strike a nice balance) and then take the TOP 1 job that frees up and give it more work.

    I was able to solve my problem, though surprisingly, querying the jobs table did not solve it. I ended up checking a date field that the job populates; once that is populated, I allow the import process to continue.
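
    A minimal sketch of that check, assuming a hypothetical startedAt column on the PullJobHistory table from post #6 that the job stamps as its first action:

    Code:
    	-- Hypothetical: wait until the job has stamped its start time,
    	-- proving it has actually spawned, before handing out the next slot
    	WHILE NOT EXISTS
    	(
    		SELECT 1
    		FROM PullJobHistory WITH (NOLOCK)
    		WHERE batchSlot = @jobId
    		AND startedAt IS NOT NULL   -- startedAt is a hypothetical column name
    		AND isComplete = 0
    	)
    		WAITFOR DELAY '00:00:01'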

  10. #10
    Join Date
    Jun 2005
    Posts
    319
    Damn Pat, I thought about it more and you are right: far fewer headaches than trying to see if a job is still running. Much easier to queue up the work in a table and have the jobs cherry-pick what to work on. I'm actually assigning the work to each of the jobs, so there will be no collisions. Thanks again.

  11. #11
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    I've been down that road, multiple trips. I've found LOTS of ways to hurt myself. Having a job (or, even better, a job that runs an SSIS package to get easy multi-threaded processing for free) is the best answer I've found so far. Only one thing to track (the job), a simple table to queue and monitor the processing, and lots more peaceful nights!

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.
