  1. #1
    Join Date
    Feb 2009
    Posts
    33

    Unanswered: Update employee ID throughout database

    I have a database that has dozens of tables. Many of these tables reference the employee ID.

    For example tblDaysOff has a column employeeID that is matched on tblEmployees.ID, and there are many such tables.

    Now the way the employee IDs are generated is changing. Instead of an alphanumeric value stored as text, all employee IDs will be uniqueidentifiers stored as text values.

    The question is: how can I change every instance of "somevalue" to "differentvalue" in every record of every column named "employeeID", in every table in the database?

    This is what I have cobbled together from multiple sources ... but there is a syntax error where @max is located.

    Code:
    USE CsDB
    
    -- Every table that has an employeeID column, numbered for the loop below
    DECLARE @t TABLE(tRow int identity(1, 1), tSchemaName nvarchar(max), tTableName nvarchar(max))
    
    INSERT INTO @t
     SELECT SCHEMA_NAME(t.schema_id), t.name
     FROM sys.tables AS t
     JOIN sys.columns AS c ON c.object_id = t.object_id
     WHERE c.name = N'employeeID'
    
     DECLARE @max int, @i int
     SELECT @i = 1, @max = MAX(tRow) FROM @t
    
     DECLARE @sql nvarchar(max)
     DECLARE @params nvarchar(max)
    
     SET @params = N'@newValue varchar(36), @oldValue varchar(36)'
    
     -- The old and new values would be supplied by whatever drives this script
     DECLARE @newValue varchar(36), @oldValue varchar(36)
    
     WHILE @i <= @max
     BEGIN
         SET @sql = N'UPDATE [' +
             (SELECT tSchemaName FROM @t WHERE tRow = @i) +
             N'].[' +
             (SELECT tTableName FROM @t WHERE tRow = @i) +
             N'] SET employeeID = @newValue
                 WHERE employeeID = @oldValue'
    
         EXEC sp_executesql @sql, @params, @newValue, @oldValue
    
         SET @i = @i + 1
     END
    Obviously I don't want to run this and then have to try to recover the database when things go awry ... I would like to get it right ... only I am not fluent in SQL.

  2. #2
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    Good God, don't run that.
    Add a column to your employees table named "OldID".
    Populate it with the current ID values.
    Generate new GUIDs in the ID column.
    Then run this script for each subtable:

    update TargetTable
    set EmployeeID = Employees.ID
    from [YourChildTable] as TargetTable
    inner join Employees on TargetTable.EmployeeID = Employees.OldID
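    Sketched in T-SQL, the preparation steps might look like this; tblEmployees as the parent table and a varchar(36) ID column are assumptions taken from the original post:

    Code:
    -- 1) Add the holding column (the GO matters: the column must exist before the updates compile)
    ALTER TABLE tblEmployees ADD OldID varchar(36) NULL
    GO
    
    -- 2) Preserve the current IDs so the child tables can be rejoined
    UPDATE tblEmployees SET OldID = ID
    
    -- 3) Generate the new GUIDs; NEWID() converts implicitly to varchar(36).
    --    (Assumes foreign keys to ID are absent or disabled while this runs.)
    UPDATE tblEmployees SET ID = NEWID()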
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  3. #3
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    Quote Originally Posted by keb1965 View Post
    Obviously I don't want to run this and then have to try to recover the database when things go awry ... I would like to get it right ... only I am not fluent in SQL.
    Obviously you are going to run this several times in a test environment, right?

    Right????
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  4. #4
    Join Date
    Feb 2009
    Posts
    33
    The problem is actually deeper than just the employee ID; we have hundreds of other items that must similarly be changed. In total I have 1753 tables to deal with, spread across 17 databases. Everything is being merged into one, and that is why we need something other than simply doing one table at a time.

    What I need to do is get the schema from the database, identify every column that is employeeID, and match each employeeID value to a uniqueidentifier that will be generated programmatically. That is, we will supply a uniqueidentifier that will replace the employeeID in the database, but every occurrence of employeeID must be changed in every table, otherwise our stored procedures will not return the correct data.

    and yes, I have duplicated the databases on a local machine for testing. There is no way I would even think about running this in a working environment without testing extensively.

    So, to recap ...
    We will handle each item independently; we don't need to do them all at once. We will start with employeeID.

    1) Select all employees from the employee table as emp
    2) Find all columns in all tables where employeeID column exists
    3) Update ID column in table emp with new employeeID
    4) In every table found in step 2, update every record's employeeID to the new value wherever it equals the old ID
    5) Go to the next record in table emp
    6) Wash, rinse, repeat (a sketch follows this list).
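    A minimal sketch of that loop, with tblEmployees as the parent and tblDaysOff standing in for the tables found in step 2 (and assuming any foreign keys are disabled while it runs):

    Code:
    DECLARE @oldID varchar(36), @newID varchar(36)
    
    DECLARE emp CURSOR LOCAL STATIC FOR
        SELECT ID FROM tblEmployees
    
    OPEN emp
    FETCH NEXT FROM emp INTO @oldID
    
    WHILE @@FETCH_STATUS = 0
    BEGIN
        SET @newID = CAST(NEWID() AS varchar(36))    -- step 3: the new value
    
        UPDATE tblEmployees SET ID = @newID WHERE ID = @oldID
    
        -- step 4: repeat for every table found in step 2
        UPDATE tblDaysOff SET employeeID = @newID WHERE employeeID = @oldID
    
        FETCH NEXT FROM emp INTO @oldID              -- steps 5 and 6
    END
    
    CLOSE emp
    DEALLOCATE emp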

    So, can this be done?

  5. #5
    Join Date
    Oct 2009
    Location
    221B Baker St.
    Posts
    486
    One way to do what you want is to unload the data from the tables into external files. Modify the external files for all changes. Then reload all of the data back into the tables.
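    The unload half is usually done with the bcp command-line utility; the reload half can be sketched in T-SQL with BULK INSERT (the table name, file path, and terminators here are placeholders):

    Code:
    -- Reload one table from an edited character-format file
    BULK INSERT dbo.tblDaysOff
    FROM 'C:\export\tblDaysOff.dat'
    WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n')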

    Do this until you/the users are satisfied it is completely working in the test environment.

    Shut down the production environment. Using a proven backup method, back up the production environment. Empty all of the tables. Reload everything. Verify that all is correct.

    Keeping the OldId could be very useful. . .

  6. #6
    Join Date
    Feb 2009
    Posts
    33
    Dumping the data and editing it is not an option.

    This must be done in a live environment.

  7. #7
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    You can't do this in a live environment.
    The process is going to take hours, during which time relational integrity will be violated and any transactions against the database will be suspect.
    What about creating a new copy of the database with the schema changes you want, but without any data, and then copying the old data into it?
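    Sketched for one child table, with NewDB, OldDB, GuidID, and dayOffDate as assumed names:

    Code:
    -- Copy the old rows in, remapping the employee key on the way
    INSERT INTO NewDB.dbo.tblDaysOff (employeeID, dayOffDate)
    SELECT e.GuidID, d.dayOffDate      -- GuidID = pre-generated uniqueidentifier per employee
    FROM OldDB.dbo.tblDaysOff AS d
    INNER JOIN OldDB.dbo.tblEmployees AS e
        ON d.employeeID = e.ID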
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  8. #8
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    As far as repeating this process for other tables goes, you can take my suggestion and write queries against the system object tables to automatically generate scripts for all the tables and foreign keys.
    Doing this, table by table, will be MUCH faster than doing it record by record.
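    A sketch of that generation step, assuming the OldID column from post #2 is in place; it emits one set-based UPDATE per table, ready to review before running:

    Code:
    SELECT N'UPDATE c SET c.employeeID = e.ID FROM ['
         + SCHEMA_NAME(t.schema_id) + N'].[' + t.name
         + N'] AS c INNER JOIN dbo.tblEmployees AS e ON c.employeeID = e.OldID;'
    FROM sys.tables AS t
    JOIN sys.columns AS col ON col.object_id = t.object_id
    WHERE col.name = N'employeeID'
      AND t.name <> N'tblEmployees'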
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  9. #9
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    I would VERY STRONGLY recommend that you store the uniqueidentifier as an alternate key in the employee table, but continue to use an integer as the primary key.

    GUIDs are wonderful things for many reasons and they solve many problems, but they make relatively poor primary keys in SQL Server for a number of reasons. Jason Strate recently released Expert Performance Indexing for SQL Server 2012, which explains this in great detail.

    Match the GUID against any incoming requests from the client applications, but use the existing INT PK for joins, etc. You get all of the benefits of using a GUID, but don't have to deal with the problems associated with it!
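    The shape of that table, sketched with illustrative names (PublicID, FullName, and the NEWSEQUENTIALID() default are assumptions, not a prescription):

    Code:
    CREATE TABLE dbo.tblEmployees (
        ID       int IDENTITY(1, 1) NOT NULL
            CONSTRAINT PK_tblEmployees PRIMARY KEY,
        PublicID uniqueidentifier NOT NULL
            CONSTRAINT DF_tblEmployees_PublicID DEFAULT NEWSEQUENTIALID()
            CONSTRAINT UQ_tblEmployees_PublicID UNIQUE,
        FullName nvarchar(100) NOT NULL
    )

    Clients look rows up by PublicID; joins and foreign keys stay on the integer column.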

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  10. #10
    Join Date
    Feb 2009
    Posts
    33
    Lots of things to consider.

    I know the dangers of running this on the entire system at once, which is why I want to roll it into a stored procedure and only update the items as they are needed.

    For example, when we update an employee's information, the query would run only for that employee. Other employees would be updated as their records are touched. Essentially, what I propose is to check whether the primary key has been updated. If it has not, update it and do your other stuff. If it has, just do what you came there to do.
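    A sketch of that migrate-on-touch check as a stored procedure; the procedure name, the IsMigrated flag, and the table list are assumptions, and it presumes foreign keys are absent or disabled inside the transaction:

    Code:
    CREATE PROCEDURE dbo.usp_TouchEmployee
        @employeeID varchar(36)
    AS
    BEGIN
        SET NOCOUNT ON
    
        IF NOT EXISTS (SELECT 1 FROM dbo.tblEmployees
                       WHERE ID = @employeeID AND IsMigrated = 1)
        BEGIN
            DECLARE @newID varchar(36) = CAST(NEWID() AS varchar(36))
    
            BEGIN TRANSACTION
                UPDATE dbo.tblDaysOff SET employeeID = @newID WHERE employeeID = @employeeID
                -- ...repeat for the other tables found in the schema scan...
                UPDATE dbo.tblEmployees SET ID = @newID, IsMigrated = 1 WHERE ID = @employeeID
            COMMIT
    
            SET @employeeID = @newID
        END
    
        -- ...then do whatever the caller actually came here to do
    END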

    In any given employee record, there are probably a hundred associated records or so in various other tables. We can do this incrementally so the server isn't overwhelmed and the data integrity is preserved for the long haul.

    As far as not using GUIDs .. that isn't my call. I am merely the guy having to make it work. I would not have considered using GUIDs as a primary key.

    The only way I see this working effectively, is to read the schema to find all related columns and related tables, then use that information to update the relevant records in each table. Preliminary tests indicate there are 14 tables that must be updated, but there are child relations that will be updated later, because the data is only archived and is not currently used.

    I do appreciate everyone's comments and discussion.
    I have revised the original SQL to only operate on specific tables and executed it in a test environment several times. The result is quite impressive, although there will be a considerable amount of testing before this is rolled out.

  11. #11
    Join Date
    Oct 2009
    Location
    221B Baker St.
    Posts
    486
    which is why I want to roll it into a stored procedure and only update the items as they are needed.
    This kinda scares me ... It seems like there would be many ways for some inconsistency to get into the data. Spread over weeks or months, how might the original state ever be restored if something was discovered later?

    IMHO, this would be best supported by having a "conversion window"; how long that window needs to be depends on the volume of data to be modified, and the time required would become apparent during full-volume testing.

  12. #12
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    No offense, but the whole idea of what you are attempting to do is abhorrent.
    How big is your database? Are you seriously going to operate in an environment where some of your data has been converted to a new schema and the rest has not? Where some of your data uses GUIDs and some uses integers, within the same table? Where transactional or OLAP operations are put on hold while your code loops through an employee record to update related tables?
    What if you run a transaction that affects, say, 1000 employee records? You're going to make that transaction wait until your code has been executed 1000 times, each time looping through multiple foreign keys and performing single-record updates?
    Add a pinch of paprika, broil 30 minutes, and you have a recipe for disaster that serves your entire IT department.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  13. #13
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    "Now the employee IDs are changing the way they are generated. Instead of a alphanumeric value being stored as a text value, all employee IDs will be uniqueidentifiers stored as text values."
    And why do you need to convert the old data to GUIDs anyway? Both are being stored as text. Presumably the old ones are unique already, right? Presumably none of the old IDs look even vaguely like GUIDs, so they won't be duplicated by any new records that are created, right?
    You understand that GUIDs are surrogate keys, which by definition have no intrinsic relation to the underlying data they identify. It should not matter if you keep your old EmployeeIDs, so long as you don't change the cardinality and you're not changing the datatype.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  14. #14
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    May I again VERY STRONGLY suggest at least doing a backup of the working system before you start making changes?

    What you are contemplating is a very, very bad idea for many reasons. You appear to be determined to follow it through, and that is certainly your choice. You'll be a lot better off if you have a good backup before you start, for many reasons.

    For what it is worth, you can store nearly anything as text (in this case I hope you mean VARCHAR). That does not make it a good idea. Give some serious thought to what you are doing.

    Consider reading a good book on SQL Server, or hiring an experienced consultant... That may save you a lot of work and maybe some serious heartache too!

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  15. #15
    Join Date
    Oct 2009
    Location
    221B Baker St.
    Posts
    486
    You'll be a lot better off if you have a good backup before you start, for many reasons.
    Definitely!

    However, even if a good backup is taken before this starts, the conversion would continue over weeks or months. If something critical is discovered at some late date, how would you be able to use that original backup? Yup, backups would still be taken regularly, but a do-over might be quite problematic.

    I'd still push for a "conversion window" and reduce/eliminate this risk.

    fwiw
