Unanswered: Query in stored procedure runs too fast
Ok, I'll admit right off the bat that I never suspected that I'd ever raise this complaint, much less worry about how to fix the "problem" associated with it!
We're preparing to take a large set of changes (projects) to PeopleSoft Financials from development to test. The code is still somewhat rough, but it has been "desk checked" to ensure that it does what the developers think that it ought to do, and they've blessed it at that point. The code is now moving into the test phase, and the QA team is finding locking/blocking issues that we've never seen in this code before... Sort of a "lock avalanche" where no one process locks for very long, but many of them block one another to the point where applications actually "freeze" while almost never hitting a deadlock.
My solution was to create a "blitzkrieg" query / stored procedure that would periodically sample master.dbo.sysprocesses, master.dbo.sysdatabases, and apply one of the dm_ functions to gather information on locking, blocking, and deadlocking. My procedure runs nicely (it never hangs) and gets about 99.3% of the data that I want.
The problem is that the blasted query / stored procedure runs either too fast or too slow, depending on how you look at it. Because the dm_ function takes a few ms to run, there can be a situation where either a row appears as a false positive or as a missing row because of timing... Either the culprit shows up as a blocker, but by the time the victim spid is evaluated the block has cleared, or the row is skipped and by the time the victim is evaluated the block has occured.
The whole process runs in well under 100 ms when there is nothing to report, and I've never seen it run 200 ms yet under the worst conditions it has faced, so the code is fast... The problem is that I really don't want to try to enforce any kind of locking to resolve the issue, because that locking would impact performance and that is EXACTLY what I do NOT want to do.
One notion that I'm playing with is to harvest everything but the text in one pass (an INSERT), then get the command text in a separate pass (UPDATE). The down side to this idea is that this stuff is blindingly fast, so even the short pause between the INSERT and subsequent UPDATE operations might cause me to actually record the wrong command text (that has already happened in this particular case).
You may be able to use the most_recent_sql_handle column in sys.dm_exec_connections to filter out new commands. If the current most_recent_sql_handle does not equal the harvested most_recent_sql_handle, then you should discard the result. Ideally, you would collect all the information at once. When i was trying to get at some other waittypes (disk waits as I recall) in SQL 2000, i had this same problem, because I was harvesting the SPIDS, then looping back through them to collect the commands. Like you have found, it works 90+% of the time.
This is what I am working with in SQL 2005. sys.dm_exec_requests only shows running requests, and I have not had a lot of chances to play with blocking situations, yet (OK, I am too lazy to force a blockinig situation)
select r.session_id, r.blocking_session_id, r.wait_type, s.text
from sys.dm_exec_requests r cross apply
where r.session_id in (select blocking_session_id from sys.dm_exec_requests)
or blocking_session_id > 0
Does this come anywhere near what you need, or is this what you also started with?
I'm actually starting to think that 99.3% is as good as I can get. One orphan or widow row in every 150 or so isn't going to kill me, and this process is so snarky fast that the only other way to see it is to use SQL Profiler, and Profiler is often just enough load to completely cloak this problem (so it becomes a Heisenbug).