Quote:
Originally Posted by mciasco
Great links man! Very usefull. Thanks a lot.
So it seems that the locking strategy is something provided at application level (by some software layer) and there's no support from the DBMS.
|
The links may be good but they identify the problem is a very narrow context (assuming objects, no ANSI database, application "memory"), thus they are not definitive of either the problem or the solution. Thinks in object/agile terms only, not in database terms.
It is not correct to say what you have stated above.
1 The problem is an application problem (outside the database) and thus it cannot be solved inside the database, it must be resolved in the app code.
2 There IS specific support in the database via the standards provided by ANSI, and those are implemented. You need to understand them and use them. Transaction Isolation Levels mean something other than what you expected it to mean, and it works very nicely. It has to do with ANSI require that the ACID Properties of transactions will be preserved.
3 The locking strategy IS provided by the dbms, but you do need to know how it works; how it is affected by transaction isolation levels; and what it does not do. That is, the dbms
does provide Transaction Isolation and locking, within ANSI standards; but it only does what you tell it to do; nothing more and nothing less. It does not provide knowledge re what has changed in the db, unless you are tracking that.
4 Your problem is simple naοve app design, not knowing about Lost Updates, Phantoms, etc; and therefore not having the code constructs in place to avoid these well-known problems.
The fact is, the database is online, and there is a time lag between data retrieval (select) and data modification (update): if the update is dependent on the data (imagine your example but consider bank accounts where you will be able to withdraw more funds than you actually have), then you must check that at the time of the data modification (what the data was at the time of retrieval is meaningless in a concurrent system).
The time lag may be a few seconds, or hours while the user goes to lunch. It is not permissible to allow locking of the rows (open transaction) while there is user interaction going on.
Optimistic Locking
There are many issues that transactions should avoid by coding standards and constructs, we are dealing with just one issue here (transactions need a fuller treatment).
The links are agile-specific, there is no need for such constructs, detection of "collisions", etc. It is actually much more simple.
Here is the database context for Optimistic Locking.
This has been around for at least 30 years that I have experience with; it predates the Relational model. The information is not copyright, it is basic theory for concurrent systems (if it were, it would be copyright IBM 360 CICS/TPS, circa 1976!).
The trick is to understand that the
state of the data is actually in the data, therefore it should be placed in the database (the database does help), not in an app construct. Of course, the two go hand in hand.
Requirement
In every transactional row, include a TimeStamp column. An alternative is a Version column, which is incremented with each update. Often, a TimeStamp is required for other reasons (such as actual requirement to identify when the row was last changed; on standard-compliant systems, it is demanded, along with a last updated UserStamp).
UserIds and Global Ids are not valid alternatives (implementation of such would produce large code segments and incur additional overhead from the Global Id generator; all quite unnecessary).
The chronology is:
User Interaction via App (client based, Non-transactional; no locks held)
-
SELECT table
____WHERE PK = desired.PK AT ISOLATION 1
--- Read Committed (you want data that is reliable; not dirty, uncommitted)
- app stores the TimeStamp for each row saved in app memory
- user/app works with data (takes their time)
- more selects, more work
- when ready, user/app submits a transaction
Transaction Code (server based, no user interaction [possibility of delay!]; locks held):
- hopefully the code is atomic as well
Prepare_Block
-
SELECT table
____WHERE PK = saved.PK and TimeStamp = saved.TimeStamp AT ISOLATION 3
--- Serialised or Read Committed plus lock until end of transaction
--- if row not found, exit with "Data has changed between retrieval and update" error message
...
Update_Block
BEGIN TRAN
UPDATE table SET
________column = value,
________TimeStamp = current.server.DateTime,
________ ...
____WHERE PK = saved.PK and TimeStamp = saved.TimeStamp
...
COMMIT TRAN
This code construct provides superior performance to the author's (it is undesirable to obtain locks or "write lock" rows that you later find out you cannot update).
It is called Optimistic Locking because it takes the optimistic view that in the vast majority of cases, the data will
not have changed between retrieval and modification; and effectively places the burden only on the small majority in which has changed.
There is no technical term as "pessimistic locking"; sure, it is understood in contrast to Optimistic locking, but different people mean different exact things by it, such as the author does here. Locking and holding the rows whilst user interaction is taking place is the definition of a single user system; it is not simply that it "does not scale", it disallows concurrency while the user has coffee or worse.
Unfortunately Sybase use the term Optimistic Locking to describe certain product features, and that too, is false, it means something other than the standard meaning.
The problem is not a "collision", the transactions did not collide, the second is merely an attempt based on stale data; it is "the data has changed between data retrieval and the data modification attempt" and the user who based their intended transaction on the initial data can figure out what they want to do, themselves. It is always the "fault" of the slower (second) transaction (the first transaction succeeded, and did it faster).
There are no "Collision resolution" options, just post the error msg, fetch the data again and return to the user.
There is another coding standard that is important here as it reduces (minimises) the incidence of the Lost Update problem (as well as affecting overall performance and limiting the scope (isolation) of DDL changes):
----------
a Always select only the columns that you need
b Always update only the columns that you change.
That second item is hard to do and most people don't bother. However, if Optimistic Locking is implemented, it takes care of the problem, and therefore the requirement [b] does not have to be strictly adhered to.