  1. #1
    Join Date
    Mar 2009
    Posts
    15

    Concurrent UPDATE and isolation level

    I'm wondering about a strange situation...
    Suppose we have an application to manage customer information. It is just a simple GUI where I can read and modify one customer at a time (perhaps with next and previous buttons to navigate through customers...). Suppose two users want to modify the same customer (and thus the same customer record!) at the same time. So we have two UPDATE transactions.

    TRANSACTION 1 (T1): The first user tries to modify the customer's age from '30' to '31' (happy birthday!)

    TRANSACTION 2 (T2): The second user tries to modify the customer's phone from '555' to '666'.

    There are two possible erroneous outcomes if, at the end of T1 and T2, the customer record shows a mix of the two updates (age=31 and phone=555, or age=30 and phone=666).

    In theory, to avoid these errors we have to select at least the READ COMMITTED isolation level (the default level in most DBMSs). But I can't figure out how it really works.
    Both users first select the same customer, and the GUI's fields are filled with values. These values are the same for both users, so both see the customer's age set to 30 and his phone number set to 555 (the original values). User 1 modifies the age field to 31, while user 2 modifies the phone field to 666.
    Finally they hit the save button on the screen and the two transactions T1 and T2 start... but there's no way to avoid the mixing of the two updates. If T1 commits first, the customer will end up with age=30 and phone=666, because the last update comes from T2, which committed after T1. Otherwise, if T2 commits first, the customer will end up with age=31 and phone=555, because the last update comes from T1, which committed after T2.

    I'm sure there's something wrong with my understanding... but where?
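
    The scenario can be reproduced in a few lines. This is a hypothetical sketch using Python/SQLite (all table and column names invented) that simulates both users reading the row into their GUI fields and then writing back all fields on save:

    ```python
    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, age INTEGER, phone TEXT)")
    con.execute("INSERT INTO customer VALUES (1, 30, '555')")

    # Both users read the same row into their GUI fields.
    row_user1 = con.execute("SELECT age, phone FROM customer WHERE id = 1").fetchone()
    row_user2 = con.execute("SELECT age, phone FROM customer WHERE id = 1").fetchone()

    # T1: user 1 saves, writing back ALL fields (age changed, phone as read).
    con.execute("UPDATE customer SET age = ?, phone = ? WHERE id = 1", (31, row_user1[1]))

    # T2: user 2 saves, also writing back ALL fields (phone changed, age as read).
    con.execute("UPDATE customer SET age = ?, phone = ? WHERE id = 1", (row_user2[0], '666'))

    print(con.execute("SELECT age, phone FROM customer WHERE id = 1").fetchone())
    # T1's birthday update is silently overwritten: the row ends up (30, '666')
    ```

    Both UPDATEs commit cleanly, so no isolation level complains; the overwrite happens because the second write is based on stale data.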

  2. #2
    Join Date
    Sep 2002
    Location
    UK
    Posts
    5,171
    This is called the "lost update problem", and there are two different ways to avoid it: optimistic locking and pessimistic locking.

    Pessimistic locking works in a stateful environment: the application takes out a lock on the record when it is selected so that no one else can update it until the first user has committed or rolled back - thus guaranteeing that the first user's update does not overwrite anyone else's changes.

    Optimistic locking is the only possibility in a stateless environment like the web: the application obtains information about the record when it is first selected by the user, and then ensures that that information is still the same at the point of performing the update. This information may be the entire value of the record, or a checksum based on it, or a record version number, or an update timestamp, or some other scheme devised by the database designer.
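
    As a concrete illustration of the version-number variant, here is a minimal sketch in Python/SQLite with invented table and column names. Each user remembers the version they read, and the UPDATE only matches if that version is still current:

    ```python
    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, age INTEGER, "
                "phone TEXT, version INTEGER)")
    con.execute("INSERT INTO customer VALUES (1, 30, '555', 1)")

    # Both users remember the version they read along with the data.
    age1, phone1, ver1 = con.execute(
        "SELECT age, phone, version FROM customer WHERE id = 1").fetchone()
    age2, phone2, ver2 = con.execute(
        "SELECT age, phone, version FROM customer WHERE id = 1").fetchone()

    # User 1's update succeeds: the version still matches.
    cur = con.execute(
        "UPDATE customer SET age = ?, phone = ?, version = version + 1 "
        "WHERE id = 1 AND version = ?", (31, phone1, ver1))
    assert cur.rowcount == 1   # one row updated

    # User 2's update matches no row: the version has moved on.
    cur = con.execute(
        "UPDATE customer SET age = ?, phone = ?, version = version + 1 "
        "WHERE id = 1 AND version = ?", (age2, '666', ver2))
    if cur.rowcount == 0:
        print("Data has changed since it was read; refetch and retry")
    ```

    The zero rowcount is the signal to the application to refetch the row and let the user decide what to do.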

  3. #3
    Join Date
    Mar 2009
    Posts
    15
    Quote Originally Posted by andrewst View Post
    This is called the "lost update problem", and there are two different ways to avoid it: optimistic locking and pessimistic locking.

    Pessimistic locking works in a stateful environment: the application takes out a lock on the record when it is selected so that no one else can update it until the first user has committed or rolled back - thus guaranteeing that the first user's update does not overwrite anyone else's changes.

    Optimistic locking is the only possibility in a stateless environment like the web: the application obtains information about the record when it is first selected by the user, and then ensures that that information is still the same at the point of performing the update. This information may be the entire value of the record, or a checksum based on it, or a record version number, or an update timestamp, or some other scheme devised by the database designer.
    Great links man! Very useful. Thanks a lot.
    So it seems that the locking strategy is something provided at application level (by some software layer) and there's no support from the DBMS.

  4. #4
    Join Date
    Sep 2002
    Location
    UK
    Posts
    5,171
    That's right, but I don't think it is because the DBMS is being unhelpful - when you tell it to update a row and set all its columns to values you supply, it doesn't know whether you are doing that in a deliberate and knowing attempt to overwrite any changes someone else recently made, or whether you are unaware and don't really mean to change some of the columns. Only your code knows where it got the values from, and which ones it really wants to change.

  5. #5
    Join Date
    Mar 2009
    Location
    Sydney, Australia
    Posts
    258
    Quote Originally Posted by mciasco View Post
    Great links man! Very useful. Thanks a lot.
    So it seems that the locking strategy is something provided at application level (by some software layer) and there's no support from the DBMS.
    The links may be good, but they identify the problem in a very narrow context (assuming objects, no ANSI database, application "memory"), so they are not definitive of either the problem or the solution. They think in object/agile terms only, not in database terms.

    It is not correct to say what you have stated above.

    1 The problem is an application problem (outside the database) and thus it cannot be solved inside the database, it must be resolved in the app code.

    2 There IS specific support in the database, via the standards provided by ANSI, and those standards are implemented. You need to understand them and use them. Transaction Isolation Levels mean something other than what you expected them to mean, and they work very nicely. They have to do with the ANSI requirement that the ACID properties of transactions be preserved.

    3 The locking strategy IS provided by the DBMS, but you do need to know how it works; how it is affected by transaction isolation levels; and what it does not do. That is, the DBMS does provide transaction isolation and locking, within the ANSI standards; but it only does what you tell it to do, nothing more and nothing less. It does not provide knowledge of what has changed in the database unless you are tracking that.

    4 Your problem is simple naïve app design: not knowing about Lost Updates, Phantoms, etc., and therefore not having the code constructs in place to avoid these well-known problems.

    • The fact is, the database is online, and there is a time lag between data retrieval (SELECT) and data modification (UPDATE). If the update is dependent on the data (imagine your example, but consider bank accounts, where you would otherwise be able to withdraw more funds than you actually have), then you must check the data at the time of the modification; what the data was at the time of retrieval is meaningless in a concurrent system.
    • The time lag may be a few seconds, or hours while the user goes to lunch. It is not permissible to hold locks on the rows (an open transaction) while user interaction is going on.

    Optimistic Locking

    • There are many issues that transactions should avoid via coding standards and constructs; we are dealing with just one issue here (transactions need a fuller treatment).
    • The links are agile-specific; there is no need for such constructs, detection of "collisions", etc. It is actually much simpler.
    • Here is the database context for Optimistic Locking.
    This has been around for at least the 30 years that I have experience with; it predates the Relational Model. The information is not copyrighted; it is basic theory for concurrent systems (if it were, the copyright would belong to IBM 360 CICS/TPS, circa 1976!).

    The trick is to understand that the state of the data is actually in the data, therefore it should be placed in the database (the database does help), not in an app construct. Of course, the two go hand in hand.

    Requirement
    In every transactional row, include a TimeStamp column. An alternative is a Version column, which is incremented with each update. Often a TimeStamp is required for other reasons anyway (such as an actual requirement to identify when the row was last changed; on standards-compliant systems it is demanded, along with a last-updated UserStamp).
    • UserIds and Global Ids are not valid alternatives (implementing them would produce large code segments and incur additional overhead from the Global Id generator; all quite unnecessary).

    The chronology is:

    User interaction via app (client-based, non-transactional; no locks held):
    - SELECT table
          WHERE PK = desired.PK AT ISOLATION 1
      --- Read Committed (you want data that is reliable; not dirty, uncommitted)
    - app stores the TimeStamp for each row, saved in app memory
    - user/app works with the data (takes their time)
    - more selects, more work
    - when ready, user/app submits a transaction

    Transaction code (server-based, no user interaction [possibility of delay!]; locks held):
    - hopefully the code is atomic as well

    Prepare_Block
        SELECT table
            WHERE PK = saved.PK AND TimeStamp = saved.TimeStamp AT ISOLATION 3
        --- Serialisable, or Read Committed plus lock until end of transaction
        --- if the row is not found, exit with a "Data has changed between retrieval and update" error message
        ...

    Update_Block
        BEGIN TRAN
        UPDATE table SET
            column = value,
            TimeStamp = current.server.DateTime,
            ...
            WHERE PK = saved.PK AND TimeStamp = saved.TimeStamp
        ...
        COMMIT TRAN


    • This code construct provides superior performance to the author's (it is undesirable to obtain locks or "write lock" rows that you later find out you cannot update).
    • It is called Optimistic Locking because it takes the optimistic view that, in the vast majority of cases, the data will not have changed between retrieval and modification; it effectively places the burden only on the small minority of cases in which it has changed.
    • There is no such technical term as "pessimistic locking"; sure, it is understood in contrast to Optimistic Locking, but different people mean different things by it, as the author does here. Locking and holding the rows whilst user interaction is taking place is the definition of a single-user system; it is not simply that it "does not scale", it disallows concurrency while the user has coffee, or worse.
    • Unfortunately Sybase use the term Optimistic Locking to describe certain product features, and that too is false; it means something other than the standard meaning.
    • The problem is not a "collision"; the transactions did not collide. The second is merely an attempt based on stale data; the condition is "the data has changed between the data retrieval and the data modification attempt", and the user who based their intended transaction on the initial data can figure out what they want to do themselves. It is always the "fault" of the slower (second) transaction (the first transaction succeeded, and did so faster).
    • There are no "Collision resolution" options, just post the error msg, fetch the data again and return to the user.
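
    The chronology above can be rendered as a small runnable sketch. This is a hypothetical Python/SQLite version with invented names; SQLite has no AT ISOLATION clause, so the locking detail differs, but the TimeStamp check inside the transaction is the same:

    ```python
    import sqlite3, datetime

    con = sqlite3.connect(":memory:")
    con.isolation_level = None  # manage transactions explicitly
    con.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, age INTEGER, "
                "phone TEXT, ts TEXT)")
    con.execute("INSERT INTO customer VALUES (1, 30, '555', '2009-11-01 09:00:00')")

    def save_customer(saved_pk, saved_ts, new_age, new_phone):
        """Transaction code: re-check the saved TimeStamp, then update it."""
        con.execute("BEGIN")
        row = con.execute("SELECT 1 FROM customer WHERE id = ? AND ts = ?",
                          (saved_pk, saved_ts)).fetchone()
        if row is None:  # Prepare_Block: row not found under the saved TimeStamp
            con.execute("ROLLBACK")
            return "Data has changed between retrieval and update"
        con.execute("UPDATE customer SET age = ?, phone = ?, ts = ? "
                    "WHERE id = ? AND ts = ?",
                    (new_age, new_phone,
                     datetime.datetime.now().isoformat(sep=' '),
                     saved_pk, saved_ts))
        con.execute("COMMIT")
        return "OK"

    # Both users read the row (and its TimeStamp) before either saves.
    ts_read = con.execute("SELECT ts FROM customer WHERE id = 1").fetchone()[0]

    print(save_customer(1, ts_read, 31, '555'))   # first save: OK
    print(save_customer(1, ts_read, 30, '666'))   # second save: stale TimeStamp
    ```

    The second save fails the TimeStamp check and changes nothing; the app simply posts the error, refetches, and returns control to the user.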

    There is another coding standard that is important here as it reduces (minimises) the incidence of the Lost Update problem (as well as affecting overall performance and limiting the scope (isolation) of DDL changes):

    ----------

    a Always select only the columns that you need
    b Always update only the columns that you change.
    That second item is hard to do, and most people don't bother. However, if Optimistic Locking is implemented, it takes care of the problem, and therefore requirement [b] does not have to be strictly adhered to.
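
    Point [b] on its own would also have avoided the mix-up in the original question: if each user writes only the column they actually changed, neither overwrites the other's column. A minimal Python/SQLite sketch (invented names):

    ```python
    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, age INTEGER, phone TEXT)")
    con.execute("INSERT INTO customer VALUES (1, 30, '555')")

    # Each user updates ONLY the column they actually changed.
    con.execute("UPDATE customer SET age = 31 WHERE id = 1")       # user 1
    con.execute("UPDATE customer SET phone = '666' WHERE id = 1")  # user 2

    print(con.execute("SELECT age, phone FROM customer WHERE id = 1").fetchone())
    # Both changes survive: (31, '666')
    ```

    Note this only removes the column-level overwrite; it does not detect that the data changed between retrieval and update, which is what the TimeStamp check is for.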
    Last edited by Derek Asirvadem; 11-07-09 at 03:10. Reason: Further Clarity
    Regards
    Derek Asirvadem (Formerly DerekA)
    Information Architect / Senior Sybase DBA
    Copyright © 2009 Software Gems Pty Ltd

    I answer questions from the Original Poster only. If you have a genuine question, as the moderators have requested, start a new thread.

    http://www.softwaregems.com.au
