Results 1 to 7 of 7
  1. #1
    Join Date
    Nov 2004
    Posts
    1,427
    Provided Answers: 4

    Unanswered: execute a stored procedure set-based

    I had problems finding a good title for this question. Here is the problem I am facing.

    I am writing a number of data migration scripts, like the (simplified version) below
    Code:
    INSERT INTO newDatabase.dbo.DaDestinationTable(col1, col2, col3)
    SELECT col1,
    	col27,
    	col13
    FROM OldDatabase.dbo.DaSourceTable
    There are no validation rules defined in the database (don't ask why), only in the program code. Because of the absence of validations in the database, the data migration scripts could violate business rules (NOT NULL, FK violations, domain values in a column, ...).

    A solution could be that both the program and the data migration call the same stored procedure to insert records in the destination table, so the validation rules must only be defined in one place (the stored procedure).

    I have never used stored procedures in a "set-based" situation. Is this even possible? The only solution I see is to call those sp's in a loop (WHILE or CURSOR). Something like
    Code:
    SET @my_id = -1
    
    SELECT @my_id = MIN(id) 
    FROM OldDatabase.dbo.DaSourceTable
    WHERE id > @my_id
    
    WHILE @my_id IS NOT NULL
    BEGIN
    --	PRINT '@my_id = ' + CAST(@my_id as VARCHAR(10))
    	
    	SELECT @col1 = col1,
    		@col2 = col2,
    		@col3 = col3
    	FROM OldDatabase.dbo.DaSourceTable
    	WHERE id = @my_id
    
    	EXEC usp_Insert_DaDestinationTable @attCol1 = @col1, 
    					@attCol2 = @col2, 
    					@attCol3 =  @col3 
    
    	SELECT @my_id = MIN(id) 
    	FROM OldDatabase.dbo.DaSourceTable
    	WHERE id > @my_id
    END
    For each record, the table OldDatabase.dbo.DaSourceTable is hit twice, once to get the next id (@my_id), once to get the column values. I could combine both with one SELECT script by using ROW_NUMBER(), but I doubt that that will perform faster.

    I only work set-based, I have barely ever used a WHILE loop, let alone a CURSOR. I have no idea what the consequences will be on performance. There are millions of records that have to be migrated.

    I was thinking of CROSS APPLY, but you can't use that in combination with a SP.

    Any ideas?
    With kind regards . . . . . SQL Server 2000/2005/2012
    Wim

    Grabel's Law: 2 is not equal to 3 -- not even for very large values of 2.
    Pat Phelan's Law: 2 very definitely CAN equal 3 -- in at least two programming languages

  2. #2
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    When used this way, stored procedures are a remnant of the "unit/record" processing model. There is no way to make them set-oriented. The performance will be abysmal compared to a set-oriented solution.

    I would prefer to create constraints (CHECK and FOREIGN KEY) that will do the data validation that your business requires, but these constraints may break your existing code that depends on the ability to violate them (even if only for very short periods of time).

    If constraints can't be used, you could use triggers so that you could do a set-based INSERT and then validate the INSERTED view within the trigger. This is Transact-SQL based, so it can be as flexible as a stored procedure and perhaps more so due to some of the special features (like the INSERTED view) that are only available inside of a trigger.

    If nothing else will do, then you can code and execute a stored procedure for each of the millions of rows... This will be painful to write, and even more painful to run.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  3. #3
    Join Date
    Jan 2003
    Location
    Massachusetts
    Posts
    5,800
    Provided Answers: 11
    For some of the simpler cases, you could look into using table valued parameters. This would essentially allow you to pass the whole recordset into the inserting procedure. It would of course require you to make changes to or clone the stored procedure.

    The cursor may be the better alternative here, as it should give you just the one lookup to the old table per iteration. It is a fairly common problem (and one that I am facing here) that a flat ban on cursors tends to generate even worse solutions, as people try to work around the rules.

  4. #4
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    You won't hear me say this often but... SSIS is good at this!

    SSIS can perform mass inserts and can throw out rows that fail to be inserted (which you can chose what to do with separately in another step).

    Have a play with the basic data transfer steps.

    EDIT: I would create the destination with all the correct constraints, then let SSIS try to insert stuff. Anything that fails the constraints will be spat out in the error path of the transformations.
    George
    Home | Blog

  5. #5
    Join Date
    Nov 2004
    Posts
    1,427
    Provided Answers: 4
    Thank you for your suggestions.

    I am a big fan of securing the database with as many constraints as possible: PK's, unique indexes, FK's, check constraints, NOT NULL, ...

    I have little faith in triggers in general. In DB2 they would sometimes get dropped after changing the table they were defined upon. I learned to monitor them after each and every ALTER TABLE script I ran. I consider triggers as the Dark Matter of a database: you don't see them, but you can deduct that they exist (e.g. when an updated_at column suddenly gets a value, ...). The moment you realise one isn't working any more, you have no idea for how long it's been inactive.

    About SSIS: loading a certain XML file into a staging table with the program takes about half an hour, with SQL (OPENROWSET, see my previous post) it takes 2.5 minutes, with SSIS 11 seconds.
    With kind regards . . . . . SQL Server 2000/2005/2012
    Wim

    Grabel's Law: 2 is not equal to 3 -- not even for very large values of 2.
    Pat Phelan's Law: 2 very definitely CAN equal 3 -- in at least two programming languages

  6. #6
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    I have a SSIS job that runs against "critical" databases. I define "critical" as either prone to "loosing" things (usually due to users playing with things that they shouldn't) or databases that will cause the business serious harm or cost if they fail.

    The job collects information about the schema, and compares it to the cached copy (stored in a couple of tables). If there is a difference, the job emails me. I schedule the job so that it runs every few hours and also randomly one or more times per day.

    It has saved me countless hours when the users start to play with things while I'm not looking!

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  7. #7
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    Very few people should have ALTER and CREATE permissions within a production DB. In fact, only sysadmins should have that ability (inherited), but they shouldn't use it...

    I didn't realise that this was related to your previous XML question. With SSIS there's no need to do every bit of the work using their components. In fact, if you look in my SSIS packages, I would estimate that more than 95% of the steps are Transact-SQL statments/batches. There's no reason you can't still use your existing methods and code and then perform the final insert using SSIS handling.
    George
    Home | Blog

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •