Unanswered: Strange situation with Windows Authentication and undocumented function GET_SID
SQL Server 2000
Windows 2000 Advanced Server
We recently moved our servers from one domain to another. Now, we can't grant Windows users access to any databases. We CAN create the user IDs successfully via Enterprise Manager, and get a corresponding row in MASTER..SYSXLOGINS, but can't click on the Database Access tab in the SQL Server Login Properties dialog and grant accesss to any databases. When we try to do so, we get error 15401, "Windows user or group xxx not found". The underlying call is to MASTER..SP_GRANTDBACCESS, and running that via Query Analyzer returns the same error (naturally)
Looking through the code of SP_GRANTDBACCESS, I've determined that what is failing is a call to the undocumented TSQL function GET_SID. This proc takes two parameters, the first is either \G<nt group name> or \U<nt user name> and the second is NULL in the call in SP_GRANTDBACCESS. If I execute
SELECT GET_SID('\U<valid user>', NULL)
it returns NULL, however, if I run
SELECT SUSER_SID(<valid user>)
then I get the Windows SID of whatever valid user name I supply.
We have 3 servers in question, namely production, development and test. We noticed the problem on production. Curiously enough, development and test worked fine.
NOW THE PLOT THICKENS. If I run this query on the dev box....
SELECT SUSER_SID(<id>), GET_SID('\U<id>',NULL)
...the first function returned the SID, BUT THE SECOND FUNCTION DID NOT!!! How could that be? Clearly GET_SID was working inside of SP_GRANTDBACCESS, but not as a discrete call. So I went into the master database and added code to print out the SID returned by GET_SID to the proc. Lo and behold, SP_GRANTDBACCESS promptly failed with a 15401 error. It continues to get 15401s now, even after I returned it to the original code. What gives? Now my dev box has the same error production has, and all I did was recompile SP_GRANTDBACCESS a couple of times. FWIW, I did *not* ever make any changes to SP_GRANTDBACCESS on production.
Why doesn't GET_SID() work outside of SP_GRANTDBACCESS?
Why did recompiling SP_GRANTDBACCESS break it permanently?
It almost seems like the query compiler can't correctly link a call to GET_SID to the correct function in some DLL, except I thought that the compiled code didn't survive a restart, and all TSQL procs were recompiled the first time they were called after a restart. If that's the case, then the compiler is - or at least was - producing a functioning compiled version of SP_GRANTDBACCESS after every restart.
sp_change_users_login 'auto_fix' only works with SQL Server based IDs, not Windows based IDs. Besides that, the problem isn't orphaned rows, as before I started this I dropped the logins from the previous domain and thus there aren't any rows in sysusers for these accounts.
Actually, there is an easy workaround. I created a duplicate of sp_grantdbaccess that uses SUSER_SID() instead of GET_SID() and it works. It requires an additional parameter to indicate whether the Windows ID is a group or an individual, something you could determine with GET_SID() but can't with SUSER_SID(). But I'm curious why this proc is so fragile. Something is surely wrong somewhere.
This seems closely related to another problem I had with these boxes - namely I could not make *any* changes to sp_password - even as simple as just recompiling the default code - without it failing by saying for every user that they did not have the authority to execute the command DBCC AUDITEVENT. The only way to recover was to restore the master DB.