Results 1 to 11 of 11
  1. #1
    Join Date
    Jan 2009
    Posts
    7

    Unanswered: Highly transactional updates

    Hello, this is my first post to this forum and it is more a question on best practices than a request for help on a specific problem.

    I have a highly transactional system, with lots of separate processes reading and updating data on a Sybase database. There is one particular table balances, used by several different processes; this table can be understood as a set of account balances, with a structure like this:

    • id: primary key (technical)
    • account: id for the account; unique key
    • balance: current balance for the account


    I have several processes (I will call them "kids", because there are many of them and they tend to have a small duration) that, simultaneously, move a certain amount from one account X to another account Y, doing something equivalent to:

    begin transaction
    update balances set balance=balance-amount where account=X
    update balances set balance=balance+amount where account=Y
    commit

    I also have one process (I will call it "daddy" because there is only one and it takes longer to execute) that does batches of these moves; these batches must be executed atomically or not at all:

    begin transaction
    update balances set balance=balance-amount1 where account=X1
    update balances set balance=balance+amount1 where account=Y1

    update balances set balance=balance-amount2 where account=X2
    update balances set balance=balance+amount2 where account=Y2

    update balances set balance=balance-amount3 where account=X3
    update balances set balance=balance+amount3 where account=Y3

    ...
    commit

    All of these processes (kids and daddy) might be triggered simultaneously at different times of the day.

    The problem I see with this setup is that it is ripe for all kinds of locks and deadlocks (which is the case in reality, I might add). I have the nagging feeling that daddy should just ask for a lock on the whole table at the beginning, so that kids won't be bothering it while it does its thing:

    begin transaction
    lock table balances in exclusive mode;

    update balances set balance=balance-amount1 where account=X1
    update balances set balance=balance+amount1 where account=Y1

    update balances set balance=balance-amount2 where account=X2
    update balances set balance=balance+amount2 where account=Y2

    update balances set balance=balance-amount3 where account=X3
    update balances set balance=balance+amount3 where account=Y3

    ...
    commit

    The DBA looks at me like he wants to murder me for suggesting this, but I really think it would help with the locks / deadlocks. I would love to hear some feedback from the experts. How do you handle this kind of concurrency?

    Thanks and best regards.
    Last edited by gonzus; 01-20-09 at 13:39.

  2. #2
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Just a few questions to give us all a few more clues:
    • How many records in the balances table?
    • How many transactions per hour (max)?
    • What are all the indexes on balances?
    • How often are the statistics updated?
    • What type of locking is being used?
    • Do the kid processes deadlock with each other or just the daddy?
    • How often do you get deadlocks?
    • Do the deadlocks happen at a certain time of day (ie during batches)?

  3. #3
    Join Date
    Jan 2009
    Posts
    7
    Thanks for answering, will get back to you with detailed answers. Best regards.

  4. #4
    Join Date
    Jan 2009
    Posts
    7
    Ok, answers for some of your questions.

    Quote Originally Posted by mike_bike_kite
    How many records in the balances table?
    It has 260,000 records.

    Quote Originally Posted by mike_bike_kite
    What are all the indexes on balances?
    The table has the following columns:
    • account id
    • instrument type
    • date
    • balance


    The table has two indexes on these columns:

    • account id, instrument type (clustered, unique)
    • instrument type, account id (nonclustered, unique)


    Quote Originally Posted by mike_bike_kite
    How often are the statistics updated?
    Every day, but on different groups of tables. Every table on the database should get its stats updated at least once a month.

    Quote Originally Posted by mike_bike_kite
    What type of locking is being used?
    Datarow locking scheme.

    Quote Originally Posted by mike_bike_kite
    • How many transactions per hour (max)?
    • Do the kid processes deadlock with each other or just the daddy?
    • How often do you get deadlocks?
    • Do the deadlocks happen at a certain time of day (ie during batches)?
    Still finding out, will get back.

    Thanks again.

  5. #5
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    I'm sure you know why deadlocks occur ie one tran will lock A and then try to lock B while another tran has already locked B and now is trying to lock A - both processes end up getting locked. A few ideas spring to mind:

    Idea 1: A simple solution might be to alter your code by ordering the updates so they are done in account_id order. In the example above both transactions would try to lock A before B and so there'd be no deadlock - the first transaction to lock A would complete it's update and then the second transaction would start. This change would have to be done in both the kids and the daddy job. I can't guarantee this will work but I think it should.

    Idea 2: If the kid processes are always locking with the daddy process and not with each other and it's just that the daddy process needs to pick a better time to run then you could add a wait loop to the daddy process to check that there haven't been any kids running recently before running. Of course this doesn't guarantee no deadlocks but it will tend to avoid them. If you have kids running throughout the day with no downtime at all then you'd need to break out of the wait loop and run after a predetermined time.

    Idea 3: A solution requiring more effort would be to just put the transactions into a queue table (from_account, to_account, amount) rather than do the updates directly. There won't be any deadlocks on this table as processes will be atomic. Then have a separate job to read the queue table and do the real transfer using 2 updates in a transaction. As you only have one process doing all the updates then you won't get any deadlocks.

    Idea 4: If the deadlocks only bother you and not the business then you could just monitor the processes for killed jobs and automatically restart them.

  6. #6
    Join Date
    Jan 2009
    Posts
    7
    Quote Originally Posted by mike_bike_kite
    Idea 1: A simple solution might be to alter your code by ordering the updates so they are done in account_id order. In the example above both transactions would try to lock A before B and so there'd be no deadlock - the first transaction to lock A would complete it's update and then the second transaction would start. This change would have to be done in both the kids and the daddy job. I can't guarantee this will work but I think it should.
    Unluckily, this is not possible. The daddy process represents what's called a gross settlement system, so it has a few requirements of its own:

    • The updates must be performed in the specified order.
    • For each daddy process, all of the updates must be performed or none of them should.
    • For each daddy process there is an SLA, so these things cannot take too long. With today's approach (that is, not taking any explicit locks), sometimes they do take too long, and this becomes a business (and money) issue.


    Quote Originally Posted by mike_bike_kite
    Idea 2: If the kid processes are always locking with the daddy process and not with each other and it's just that the daddy process needs to pick a better time to run then you could add a wait loop to the daddy process to check that there haven't been any kids running recently before running. Of course this doesn't guarantee no deadlocks but it will tend to avoid them. If you have kids running throughout the day with no downtime at all then you'd need to break out of the wait loop and run after a predetermined time.
    This does not seem a sure safe solution. Besides, the daddy process has SLAs associated with it, so it should take precedence.

    Quote Originally Posted by mike_bike_kite
    Idea 3: A solution requiring more effort would be to just put the transactions into a queue table (from_account, to_account, amount) rather than do the updates directly. There won't be any deadlocks on this table as processes will be atomic. Then have a separate job to read the queue table and do the real transfer using 2 updates in a transaction. As you only have one process doing all the updates then you won't get any deadlocks.
    We have toyed with this idea, isolating the update processing into a separate service. This would probably work, but it has disadvantages as well; one of the things I don't like about the idea is that it would stop us from having the ability to do quick fixes on the database by just running SQL queries on it (because that could wreak havoc on the carefully queued updates and locks).

    Quote Originally Posted by mike_bike_kite
    Idea 4: If the deadlocks only bother you and not the business then you could just monitor the processes for killed jobs and automatically restart them.
    This, as seen from my answer to Idea 1, is not the case.

    We also came up with another possibility, which I outline in awful pseudo-code next:

    // Put the ids for all accounts that are involved in an operation (whether for daddy or kids) into a table:
    uid = next_unique_id();
    for each update in the operation do
    insert into operations (update_id, account_id) values (uid, account_id);
    done

    // Now we should be able to lock all of the appropriate records in balances for this operation
    // and we should be able to control the order in which these locks are taken:
    create cursor cb as select B.id, B.balance, B.account, O.account_id from balances B, operations O
    where B.account = O.account_id and O.update_id = uid
    ORDER BY O.account_id FOR UPDATE WITH LOCK AND PLEASE MAKE SURE THIS WORKS;
    oper cursor cb;
    for each row fetched from cursor cb do
    // do magic
    done
    close cursor cb;


    But I have several questions about this approach as well:
    • Does the ORDER BY for the cursor work? Can I be assured the locks will be acquired in that order?
    • When will the server acquire the locks? It could be that it acquires ALL of the locks when the cursor is opened, or maybe it will acquire them one by one when each fetch is executed.
    • I still must process the operations for the daddy process in the specified order (the "do magic" part), which is different from the lock acquisition order, and it is not obvious how this could be done properly.
    • When will the server release the locks? If it releases all of them when the cursor is closed, then maybe the loop could be:


    oper cursor cb;
    for each row fetched from cursor cb do
    // do nothing
    done

    // now all appropriate rows have locks on them, acquired in a strict order.
    // we can do all the updates desired, in any order we want.
    close cursor cb;


    It seems like this could work, but it looks a tad complex, and I am not sure it buys us anything over just taking a table lock on balances (as I initially asked).

    I would love to hear comments. Thanks and best regards.

  7. #7
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Quote Originally Posted by gonzus
    (Idea 1) The updates must be performed in the specified order.
    if it's in a tran and the updates are as you described then I can't see the difference. If you do the updates in account_id order within the transaction then you shouldn't get locks which should fix your problem.

    Quote Originally Posted by gonzus
    (Idea 1) For each daddy process, all of the updates must be performed or none of them should.
    if it's in a tran then this should still be the case (same for kid processes).

    Quote Originally Posted by gonzus
    (Idea 1) For each daddy process there is an SLA, so these things cannot take too long. With today's approach (that is, not taking any explicit locks), sometimes they do take too long, and this becomes a business (and money) issue.
    If it's taking too long because of conflicting locks then your priority should be to remove these. If it's because doing the work in the time allowed is just not possible then either your indexes/SQL need to be looked at or your SLA's need changing.

    In either case this isn't a bad thing as you can use the broken SLA as a strong argument to make any changes you wish to put forward.

    Quote Originally Posted by gonzus
    (Idea 2) This does not seem a sure safe solution.
    it's just the same solution as you have at the moment but with the option of a delayed start if the server is busy. However the new SLA issue now rules that out.
    Quote Originally Posted by gonzus
    (Idea 3) I don't like about the idea is that it would stop us from having the ability to do quick fixes on the database by just running SQL queries on it (because that could wreak havoc on the carefully queued updates and locks).
    Running quick fixes on the database sounds scary to me - if you have to do that more than once then it sounds like there serious issues with your processes in general.

  8. #8
    Join Date
    Jan 2009
    Posts
    7
    Quote Originally Posted by mike_bike_kite
    if it's in a tran and the updates are as you described then I can't see the difference. If you do the updates in account_id order within the transaction then you shouldn't get locks which should fix your problem.
    I left out a minor detail that is important for this part of the analysis: when doing an update that subtracts from the balance, it should fail if the balance is not enough; this should make the whole transaction fail. Hence, the order for the updates (specially in the case for daddy) is important, because previous updates could increase a balance and make the whole set of updates succeed. Sorry for not explaining this, I didn't want to get mired into fine details.

    Quote Originally Posted by mike_bike_kite
    If it's taking too long because of conflicting locks then your priority should be to remove these. If it's because doing the work in the time allowed is just not possible then either your indexes/SQL need to be looked at or your SLA's need changing.
    It is taking too long because of conflicting locks, and I am trying to remove these. The question is how to achieve this...

    Quote Originally Posted by mike_bike_kite
    Running quick fixes on the database sounds scary to me - if you have to do that more than once then it sounds like there serious issues with your processes in general.
    Yes, it is not very often we have to do this, but it is good to be able to when absolutely necessary.

    You didn't offer any comments on the cursor thing I outlined, nor answered the questions regarding when those locks are acquired and in which order. Could you comment on that?

  9. #9
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Quote Originally Posted by gonzus
    I left out a minor detail that is important for this part of the analysis: when doing an update that subtracts from the balance, it should fail if the balance is not enough; this should make the whole transaction fail.
    That's beginning to sound like a major detail to me
    So the following kid transactions are not the same?
    Code:
    begin tran
       update balances set balance=balance-amount1 where account=X1
       update balances set balance=balance+amount1 where account=Y1
    commit tran
    
    begin tran
       update balances set balance=balance+amount1 where account=Y1
       update balances set balance=balance-amount1 where account=X1
    commit tran
    If they are the same then just make sure the kid transactions are done in account order and that would avoid kids deadlocking with other kids but it won't stop kids deadlocking with the daddy.

    Quote Originally Posted by gonzus
    It is taking too long because of conflicting locks
    It may also be worth seeing if you can lock all the records at the start of the daddy process to ensure you won't get contention later on :
    Code:
    begin tran 
       update balances 
       set    time_last_altered = getdate()
       where  account_id in ( X1,Y1,A1,B1 etc )
    If this solves your issue then it may also be worth doing this in the kid processes as well. If it doesn't then I'd go with a queuing table (idea 3) as it will guarantee to avoid all locks.

    At the same time I'd look hard at why you need to do quick fixes on the database and see if these are really necessary or whether the queue table really makes things more difficult.

    Quote Originally Posted by gonzus
    You didn't offer any comments on the cursor thing I outlined, nor answered the questions regarding when those locks are acquired and in which order
    I just don't like cursors - they always seem to be at the root of a problem rather than the solution to one. I didn't know the answers your other questions
    Quote Originally Posted by mike_bike_kite
    • How many transactions per hour (max)?
    • Do the kid processes deadlock with each other or just the daddy?
    • How often do you get deadlocks?
    • Do the deadlocks happen at a certain time of day (ie during batches)?
    It would still be good to answer these questions if at all possible.
    Last edited by mike_bike_kite; 01-24-09 at 09:20.

  10. #10
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Did you manage to solve the problem in the end?

  11. #11
    Join Date
    Jan 2009
    Posts
    7
    I'm sorry, I am away on holiday until February 9th. But I am feeling more and more inclined to a centralized solution, akin to your suggestion of using a table to hold the pending update operations and having a single process that issues the effective updates.

    Thanks and best regards.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •