I have an existing table with an int column... say column A. The data in this column serves as the primary key.
Every time I insert a row, I need to be able to select the existing max value in column A, add 1 to it, and thereby generate a new number.
I should be able to insert a new row using this new number for the value in Col. A.
Quite simple actually...
Unfortunately column A, as I said, is int... argh... I wish it were numeric/identity, but alas, it's too late to change.
I am now trying to figure out the simplest and most effective way to do this... As I said, I cannot change the table definition. I also need to keep the multi-user environment in mind.
So selecting max+1 and then inserting a new row after that, using the selected value isn't really the ideal solution.
Basically... somehow I need to get this to work similar to the way the numeric/identity fields work, without changing the table.
I am not sure; I remember having a discussion about that bit of code a long time ago, but I can't recall the details. In any case, if you use the code as-is you end up with your data_table keys being one larger than the value reported in next_key.
You should be able to remove the extra +1 and not have any problems; give it a try on your server.
Originally posted by dbcat
That was brilliant...really helps...Thank you!
Maybe you can help me with a small doubt about the tech doc you pointed me to.
Under the Next Key Table section is the following:
begin tran
declare @next_key int
update next_key set next_key = next_key + 1
select @next_key = next_key + 1
from next_key
insert into data_table (key_column, ...)
values (@next_key, ...)
commit tran
Why is there a +1 in the update and the select??
You know, you could define the next_key column to be an identity. Just insert a row, grab @@identity, and use that value to insert into data_table. You could even partition the next_key table, or use datarows locking, to allow for a lot of concurrent inserts with no contention. The solution you show above effectively serializes, and thus drastically slows down, each request for a new key. You can seed the identity value to the current max in data_table so it picks up at the right starting value.
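A minimal sketch of that approach, in ASE T-SQL. The datestamp column is just a placeholder (an insert needs at least one non-identity column to supply a value for), and key_column is carried over from the earlier snippet; adjust names to your schema:

```sql
-- Hypothetical surrogate-key table: the identity column does the counting.
create table next_key
(
    next_key  numeric(10,0) identity,
    datestamp datetime      not null
)
lock datarows   -- ASE 11.9.2+ only; enables concurrent inserts
go

-- Grabbing a new key: insert a row, then read @@identity
-- (session-scoped, so concurrent sessions don't interfere).
declare @next_key numeric(10,0)
insert next_key (datestamp) values (getdate())
select @next_key = @@identity

insert data_table (key_column) values (@next_key)
go
```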
In your solution, you start a transaction, then update next_key set next_key = next_key + 1.
That update locks the next_key table, which prevents others from doing the same
update at the same time (assuming more than one process might be trying to insert a row
into data_table at the same time). That lock is not released until you commit, so the process
of obtaining a new key could become a bottleneck. I do mean one process at a time.
By seeding I guess I was assuming you already had some data in the data_table. If so, and you implement
a new process to obtain keys for that table, you'd need them to start at the right value. Say the max(key)
in there now is 100. You'd need the next key to be 101. I was just saying you could force the value 100
into the identity column in the next_key table to accomplish that. The next row you insert would therefore
get the value 101.
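If your version supports the identity_insert option, the seeding step might look something like this (the value 100 follows the example above):

```sql
-- Force an explicit value into the identity column once, so the
-- counter resumes at max(key) + 1 for subsequent inserts.
set identity_insert next_key on
insert next_key (next_key, datestamp) values (100, getdate())
set identity_insert next_key off
go
-- The next ordinary insert to next_key should generate 101.
```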
I would probably not do the implicit select max(id column)+1 on the same data-table. Again, if
you expect to have any kind of volume here, meaning several or more concurrent processes trying to
do this at the same time, I don't believe that would work very well.
I agree that the next_key table idea is not for high-transactional situations. Altering the table to include an IDENTITY would be the easiest, and probably 'best' method, but that isn't an option (for some reason).
If you're on a pre-11.9.2 system, remember that you have page-only locking and continually inserting to the next_key table will probably have just as much contention due to last-page locks.
Also, you will need to maintain the next_key table -- by that I mean continually clearing it out so you don't end up with 30 million rows that are, for all intents and purposes, useless.
I am not sure that my situation counts as "highly transactional"
I think we are expecting anywhere between 100 and 1,000 insert attempts on data_table per day...
Also my system supports datarow locking...so this is definitely something I can make use of.
I am not sure I understand your last point about continually clearing the next_key table.
Can you also explain the flaws with writing an embedded select
(selecting max+1) directly in the insert statement on data_table? Why would that be flawed, and does it even work in the first place?
Finally, does declaring next_key as identity avoid the potential concurrency issues around key generation?
I think the post said they can't use identity for the data_table. The next_key table would be a new table
whose sole purpose is to provide new key values for the data_table and I believe the design for that next_key
table is up in the air.
What version of Sybase is this? You may not have row level locking prior to 11.9, but you have partitions
since 11.x (I think).
And yes, you'd have to periodically prune (delete from or truncate) the next_key table or it will just keep growing.
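That periodic cleanup could be as simple as the following (the one-day cutoff is arbitrary; truncate table is faster, but check how your version treats the identity counter before using it):

```sql
-- The generated key values are never read back from this table,
-- so anything but the recent history is dead weight.
delete next_key
where datestamp < dateadd(dd, -1, getdate())
go
```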
-- other inserts, updates, deletes that are part of this unit of work go here
*** The nice part here is that inserting to next_key can be in a separate transaction from inserting
to data_table. Each insert to next_key is guaranteed to get a different value, so there is no need for an explicit transaction around the key generation.
*** Note that if data_table is not designed for concurrency as well, inserting to it could become the bottleneck here. IOW, it may be that many processes can get new keys at the same time, but they also need to be able to insert into data_table at the same time, since these two tasks will always be done together. data_table should therefore also be partitioned, or use datarows locking, or be clustered on some random value to spread the inserts around. Again, knowing the version would be helpful here.
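Depending on the version, the data_table changes suggested above might be sketched as follows (datarows locking requires ASE 11.9.2+; the partition count is arbitrary):

```sql
-- Row-level locking, so concurrent inserts don't fight over pages.
alter table data_table lock datarows
go

-- On older versions, round-robin partitions spread the inserts
-- across multiple last pages instead.
alter table data_table partition 8
go
```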
Can we not move
insert next_key (datestamp) values (getdate())
select @next_key = @@identity
inside the begin tran block??
You can but you don't need to. Ideally your transactions should be as short as possible. There's no functional need here to put it in the same transaction as the insert to data_table. Doing so will just hold the lock on next_key longer than it has to.
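So the shape of the whole thing, keeping the key grab outside the unit of work, would be roughly this (column names are assumptions carried over from the earlier snippets):

```sql
declare @next_key numeric(10,0)

-- Key generation: its own implicit transaction; the lock on
-- next_key is released as soon as this statement completes.
insert next_key (datestamp) values (getdate())
select @next_key = @@identity

-- The real unit of work starts here.
begin tran
    insert data_table (key_column) values (@next_key)
    -- other inserts, updates, deletes that belong to this unit of work
commit tran
go
```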
But does using identity actually increase concurrency of different processes being able to generate keys?? Does this solve the serializing problem you mentioned earlier??
Using identity prevents you from having to first select a value and then update, so inherently it's faster. All you are doing is inserting and retrieving @@identity and you're done. The part that really enhances concurrency is the datarows locking method or partitioning.
I would probably not do the implicit select max(id column)+1 on the same data-table. Again, if you expect to have any kind of volume here, meaning several or more concurrent processes trying to do this at the same time, I don't believe that would work very well.
You could wind up with two processes actually trying to insert the same key value here and one winding up with a duplicate key error. Remember, the locks acquired for reads are shared locks,
meaning they don't block anyone from reading at the same time. They also are released as soon as the read moves off the page (or row). They are not held for the duration of the transaction. Read up on isolation levels for more on this.
So if your code looked like this
insert data_table (my_key)
select max(my_key) + 1 from data_table
the select max part could return the same value for two different processes.
Try this, define the DDL I sent earlier, then open two isql sessions and run this in each of them. This simulates two spids each attempting to insert 1000 rows. Before you say "my app will never do that", keep in mind that if the code allows a problem to occur, it will occur eventually.
declare @i int
select @i = 0
while @i < 1000
begin
    insert data_table (my_key)
    select max(my_key) + 1 from data_table
    select @i = @i + 1
end