Results 1 to 9 of 9
  1. #1
    Join Date
    Jan 2016
    Posts
    2

    Unanswered: Autonumber as Primary Key

    Hi, this is my situation: I use SSIS to bring in data from a separate SQL database to populate a table. It is during this process that I populate an ID field in the new table that is an integer, is auto number and acts as a primary key. Using SSIS I update the table every night. There is someone else in my office who then uses Access as a front end to create forms etc. to link to that table to populate other tables. She uses the ID field as a primary key in order to link to other tables; this ID field becomes development permit number. My question is when data is added or deleted in the original database, what happens to the auto numbering when data is added or deleted? Once an ID number has been assigned to a record, I don't want a different number assigned when I run the update module.
    Basically I have a database in SQL for our GIS that brings in data from a proprietary database. I have created a database for development permits that combines financial data from the proprietary database with GIS data so that the data can be mapped.
    Thanks,
    Michael Kohler

  2. #2
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Once you assign a value to a column using a SEQUENCE or an IDENTITY attribute, that number won't change again automatically. They assign the initial value, but that's all that they do... You can change it by assigning a different value, but it won't change on its own.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  3. #3
    Join Date
    Jan 2016
    Posts
    2

    Autonumber as Primary Key

    Hi thanks for the reply and sorry for the delay I've been really busy. I understand your reply that once the identity number has been assigned it won't change again, but in my situation the table's content is deleted and repopulated with data I bring in through the SSIS job. So, if there are any changes made to the financial data, i.e. adding or deleting records wouldn't it alter the sequence in which the identity values are assigned?
    Michael

  4. #4
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    No. An Identity attribute controls the value given to a column when a row is inserted. That's all.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  5. #5
    Join Date
    Jan 2003
    Location
    Massachusetts
    Posts
    5,794
    Provided Answers: 11
    Technically, if you delete the rows, then the identity column will pick up wherever it left off. So, if you insert 100 records, they will have (by default) IDs 1 - 100. If you delete all rows from the table, the next insert will be given ID 101.

    If you reset the identity column (left as an exercise for the interested reader), there is not much of a guarantee that the record that got ID 55 will be the same record that gets ID 55 in each run. If you are adding/removing records, it becomes nearly a certainty that the records will not get the same IDs from run to run.

    The short of it is that the identity column is an artificial value that has no relation to the data, or the process that created (inserted) that data, and can't be depended on to identify a particular record. It only serves as a sort of bookmark.

    Just remember the words of Homer Simpson when he joined the Stonecutters:
    Quote Originally Posted by Homer Simpson
    It's wonderful, Marge. I've never felt so accepted in all my life. These people looked deep within my soul and assigned me a number based on the order in which I joined.

  6. #6
    Join Date
    Jan 2013
    Posts
    353
    Provided Answers: 1

    Why not use GIS tools?

    [quote] It is during this process that I populate an ID field [sic] in the new table that is an integer, is auto number and acts as a primary key.
    [\quote]

    There are some problems here. There is no such thing as a generic "ID" in RDBMS; it has to be the identifier of something in particular, it has to be verifiable, it has to be validated. It looks like what you are doing is mimicking a 1960's man tape file, which is why you confuse the column with a field. Magnetic tape files have fields; relational tables have columns. The term "field" in SQL means part of a temporal value (year, month, day, hour, minute, second).


    There is someone else in my office who then uses Access as a front end to create forms etc. to link [sic] to that table to populate other tables. She uses the ID field [sic] as a primary key in order to link [sic] to other tables; this ID field [sic] becomes development permit number.
    Access? It is a terrible product and is been replaced by much better stuff for years. And no she is not using it is a key! I key is a unique combination of columns, which model attributes of the entity in the table. This is a record number that mimics a magnetic tape file. Also the term "link" is not part of RDBMS; it refers to a pointer chain construct in network databases from the 1970's network databases.

    My question is when data is added or deleted in the original database, what happens to the auto numbering when data is added or deleted?

    Please think about this for a minute. If you had a database of automobiles, you would use the VIN as the key, right? What would you expect to to help and to the other VINs when you add or remove or alter information to this table of automobiles? Because the VIN is a key, a real key, it has not changed anything else.

    Once an ID [sic] number has been assigned to a record [sic], I don't want a different number assigned when I run the update module.
    Rows are not records, but they do exist on magnetic tape files! Can you see how your missed using SQL?

    Basically I have a database in SQL for our GIS that brings in data from a proprietary database. I have created a database for development permits that combines financial data from the proprietary database with GIS data so that the data can be mapped.
    I will guess you are on ArcGIS, just because of market share.. Why do not you look at what the vendor has in the way of tools to get such a data into the GIS? Their stuff is usually quite good, and you will not have to reinvent the wheel..

  7. #7
    Join Date
    Nov 2004
    Posts
    1,425
    Provided Answers: 4
    Hey Michael,

    this ID field becomes development permit number.
    I'm not familiar with that. What does that mean?

    As others have stated, adding a PK to your table by using an IDENTITY or SEQUENCE will not solve your specific problem. I guess you TRUNCATE or DELETE the table each day before repopulating it from the GIS system. So each day the GIS-data get different ID-values. All other tables referencing to your table now reference to non-existent records. If you created a FK constraint, SQL Server will protect you from deleting a referenced record. But I guess no FK constraints were created.

    There are two ways to solve this problem:

    A) Find the primary key of the GIS-data that you import, import that column too and use that as a PK. Each day, empty your table and reload it from the GIS-system. This is the easiest solution.

    B) An alternative is: load the GIS-data in a staging table (including the PK or business key BK from the GIS-data), determine the differences (New, altered, deleted or equal; I use the abbreviation CUDE ("C"reate, "U"pdate, "D"elete or "E"qual)) between the staging table and your production table by using the PK or BK to join both tables, modify the data in your production table so it becomes "equal" to the data in the staging table. Use logical deletes (IsActive (or IsDeleted) 1/0) instead of physical deletes.

    Use A) when the recordset the GIS system gives today will always contain the records it gave in the past. A reference to a record the Access-lady made today, will reference to the same data tomorrow, if the PK of the GIS-system is used

    Use B) when the recordset the GIS system gives today will NOT always contain the records it gave in the past: like data from a closed project, or the previous year, ... is not longer available (or the data amount becomes too big). By using logical deletes, references to deleted data will not result in PK violation errors. You may have to tweak the CUDE-determining code to take into account that only data from the current year, project , ... must be considered, or it will mark all records from previous years, projects, ... as Deleted.
    I think you will have to use B).

    As Celko advised, you will need to import the PK (or BK) of the GIS-data. No matter what solution you need.

    As a rule of thumb, don't use a Key you have not in control in your internal system or to export to others. Don't use the PK of the GIS-system, add your own ID to your production table and let the other tables in your database reference that ID. Also, for the the key that you will publish to others don't use the ID, create an extra column (string). Even when that key is numeric, declare it as a string (VARCHAR or NVARCHAR). Your needs may alter in time and a number may become a number followed by a "-" followed by another number ... A string can handle everything your boss may come up with.
    With kind regards . . . . . SQL Server 2000/2005/2012
    Wim

    Grabel's Law: 2 is not equal to 3 -- not even for very large values of 2.
    Pat Phelan's Law: 2 very definitely CAN equal 3 -- in at least two programming languages

  8. #8
    Join Date
    Nov 2004
    Location
    out on a limb
    Posts
    13,692
    Provided Answers: 59
    an autogenerated / autonumber column shgould have no meaning outside the db. ontghe face of it is temtping to use autonumber for invoice number, GRN's and so on. BUT it is not suitable as you have no control of that number, when its allocated, what is the sequence and do on.

    there's plenty of reasons to use an autonumber/autogenerated column, and plenty why you shouldn't (go google 'em...). your intended use demonstrates just one of the reasons why its not appropriate.
    if the data is coming from an external system that external system probably has its own unique identifier. by unique identifier that means some column or combination of columns that guarantees a row can be uniquely identified. if that is present in your data stream then use it.

    sadly Access does have a habit of suggesting an autonumber column as a PK, jsut because Access suggests it doesn't mean you should use it. use an autonumber key when there are no other clear standout better solutions. of then thso e better solutions are truly far better. ferinstance say you have an order & order details table
    your Auditors will almost certainly want a sequential number for order number... so autogenerated coudl be tempting... but you need to create your own sequence
    in the order details you coudl use an autogenerated column (as it doens't have any value outside the system. but you'd be better off considering if there is a so called natural candidate.
    you could use a composite order number + line number.
    you could use a composite order number + product number.

    ..either are valid and either may be appropriate.... using a line number means your recording of the products ordered shoudl be in the same sequence as the the customer ordr. makes accounts handkling/order processign fit with their order. so fewer chances to screw things up
    using the order number + product number also makes sense as it enforces a rule that you can only order the same product once in the same customer order... operationally that makes sense.... easier on order picking, delivery and so on...
    I'd rather be riding on the Tiger 800 or the Norton

  9. #9
    Join Date
    Jan 2003
    Location
    Massachusetts
    Posts
    5,794
    Provided Answers: 11
    Another fine reason for not showing any user the Identity value is that SQL Server now (SQL 2012 and higher) caches 1000 values for any identity column. If SQL Server should restart, the next ID value assigned is about 1000 more than you may expect. (More information here in the "Consecutive values after server restart or other failures" section.

    We had an application here that cheated, and did not tell me they were showing the identity value to an end user. That end user was shocked to see their ID value jump by 1000 on Patch Day.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •