Results 1 to 4 of 4
  1. #1
    Join Date
    Apr 2004

    Normalisation question

    I have been performing normalisation of a dataset and was wondering if
    and where I should introduce a unique identifier for a table.
    Basically the table consists of 3 attributes: Item type (unique), item
    description and item expert.

    Whilst Item type is unique it is not a code and is made up of entries
    such as: 'Small copper widget' and 'medium steel widget', not exactly
    data entry proof. Therefore I want to introduce a new attribute
    called item code and make it the new primary key for this table which
    will consist of something much easier to input such as A1, A2, B3
    (etc.) or 001, 002, 003, etc. This would dramatically reduce the
    liklihood of data entry errors.

    At what point of the normalisation process can I introduce this new PK
    (if at all), this is for a uni assignment so it needs to be the
    technically correct point if there is one.

  2. #2
    Join Date
    Sep 2002
    This decision is "orthogonal" to the normalisation process. For example, if you look at any of C J Date's books on databases his examples always use codes like P1, P2 for part numbers at all stages of normalisation.

  3. #3
    Join Date
    Jun 2003
    Toronto, Canada
    I may be wrong but data entry reliability (the main reason for you to bring up an artificial attribute) has nothing to do with normalization. A primary key doesn't have to be a "code"; in your example you introduce the "code" to enhance application functionality and, possibly, make programming easier.

    So, the answer to your question I think should be: you introduce your new PK _after_ normalization has been done, when developing a physical model of your data.

  4. #4
    Join Date
    Jan 2003
    Pittsburgh, PA
    You typically are better having a PK independent of all data needs, and use your PK only from the db's perspective (coding, and foreign keys, etc.); using something like an Identity or autonumber often works well (unless you have an application the prefers to generate its own primary key independent of the dB). A "code" could be an "alternate primary key"; in essense another column that you put a constraint on requiring a unique value. By doing this, you could change your "code" for a particular record, later in time, and not mess up any relationships you have.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts