Page 1 of 2 12 LastLast
Results 1 to 15 of 20
  1. #1
    Join Date
    Feb 2005
    Posts
    3

    Unanswered: how to choose a Primary Key

    Hi All,

    I have a dilemn:
    On one side, I have a column C1 which could be a primary key because it is never null, the value is unique and identify the record. The problem is its a char type and its lenght can be close to 30.
    Then, I've planned to add another column C2 of int type as PK. But then I need to add a unique constraint index on C1. Does it improve performance anyway?

    Thanks

  2. #2
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Once you've got a primary key, adding a Unique constraint doesn't help performance. You should still add the constraint, it just won't help your performance.

    -PatP

  3. #3
    Join Date
    Jul 2003
    Location
    San Antonio, TX
    Posts
    3,662
    I beg to differ, because unique constraint implicitly creates a unique index, thus - possibility of improving performance where the field(-s) is/are involved.
    "The data in a record depends on the Key to the record, the Whole Key, and
    nothing but the Key, so help me Codd."

  4. #4
    Join Date
    Feb 2004
    Location
    San Antonio, TX
    Posts
    565

    How to choose a primary key????????

    when all around you are losing their heads......

    to answer your question generally
    SQL Server 2000 Requires the primary key to be:
    Unique
    Not Null
    16 columns or less for a composite key.

    i cant help you with your di-lemon because the pk is such a personal thing. realistically your pk is your own and shouldnt be fondled by any other so i will go out on a limb here and say that what you have is fine.

    However, {Opinion} i am a big fan of the monotonic key. simple, to the point, and oh, so very integer....mmmmn i can see my key right now in a dimly lit room with nothing on but an identity of 1,1 and sumptuous nonclustered index fill-factored just right.

    i need a smoke..

  5. #5
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Quote Originally Posted by Ruprect
    i need a smoke..
    My first guess would be that you need a bit more than a smoke, but you'll probably need to visit another web site for that!

    Quote Originally Posted by rdjabarov
    I beg to differ, because unique constraint implicitly creates a unique index, thus - possibility of improving performance where the field(-s) is/are involved.
    On a slightly more serious note, I see a declared Primary Key (PK) as being somewhere between very important and critical for a table. Very little relational algebra is possible without a PK.

    An Alternate Key (AK) or the equivalent unique constraint is a different matter. The AK should be declared in order to allow the database engine to do its job and ensure uniqueness of the AK. However, there is a cost assosciated with the AK, in that CPU and I/O time and disk space need to be used to enforce it. The benefits of a declared (and enforced) AK are important, but they aren't nearly as important in my opinion as the PK is. I've been willing to forego AK definitions if the potential benefits didn't outweigh the time/disk needed to acheive them.

    While rdjabarov is right, and the AK might be useful, there are times that it can do more harm than good. From the pure relational standpoint of managing the table, I see the AK as optional. Of course, if you need to ensure uniqueness or if you have queries that will use the index, and the cost of maintaining it is less than the expected benefit, then I'm in favor of it. In fact, I'll even say that I'm generally in favor of AK definitions, but that you still need to use some judgement instead of just jumping in and declaring them everywhere they could be used.

    -PatP

  6. #6
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by Pat Phelan
    While ... the AK might be useful, there are times that it can do more harm than good.
    bumph and nonsense

    let's do an informal poll

    of all the different threads we've seen in database forums (and some of us visit more than this one), what's the most common question?

    that's right, "how do i remove my duplicates?"

    i would say that, on balance, you will do good for 99 people by insisting on the unique constraint on the alternate key, and harm for 1 person, although i am hard pressed to think of the circumstances where this might occur

    besides, if you (not you, pat, the reader) are really worried about the alleged "harm" of the unique constraint, then the solution is simple -- drop the stupid identity column and make the alternate key the real primary key

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  7. #7
    Join Date
    Feb 2004
    Location
    San Antonio, TX
    Posts
    565
    Quote Originally Posted by Ruprect
    However, {Opinion} i am a big fan of the monotonic key. simple, to the point, and oh, so very integer
    if you can revive some imperical statistics on the PK as Intelligent key vs PK as Monotonic key that would be excellent. but remember, this is a preference situation. sometimes the candidate is easily qualified to be a key and sometimes the addition of a surrogate is just as qualified.

    so what is it that makes this an issue.
    the inclusion of an additional column into the table that is taking up space in the dbf? or is it the suspected performance derived from SARGs based on monotonic integers?

    basically this whole crappy argument comes down to who can write their name in the snow the quickest.

    but for the sake of argument i will side with trotsky.

  8. #8
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    I guess I've just run into too many cases that push the limits of available hardware.

    One example of this is when gathering digital data from medical equipment. You often get data at rates that stress available hardware to the limit, and that data comes from well defined data sources. You often can't afford anything other than a PK on the collection table.

    Some of the columns are 200+ bytes wide, and there are six of those columns that form the Alternate Key. This practically doubles the size of the row for each AK index you use. The PK being only four bytes is negligable in comparison, and is a lot more valuable to me.

    While a bigger machine with a lot more disk would solve this particular problem, it would add a lot of cost and almost no benefit to the process.

    There are lots of cases where data is coming in at high speed to a single centralized server. Basically a web farm being serviced by an app farm being serviced by a SQL Server. In this case, one SQL Server might be effectively servicing 50000 simultaneous users as well as a dozen or so analysts. Even though you could create a natural key, and in this case you might even choose to declare it as an alternate, the hardware won't support using that natural key as a foreign key.

    Don't get me wrong... As I said in my previoius post I'm generally in favor of declaring AK even when I can't afford to use them as a PK, as long as it makes sense in the real world. Lots of things are really lovely in the gedankenexperiment that make a mess where the rubber actually meets the road!

    -PatP

  9. #9
    Join Date
    Feb 2005
    Posts
    3
    Thank you all for your helpful replies

    But I think I wasn't clear enough: what is better :
    - to keep my column of type varchar(30) as PK
    - to add another column of type int for example as PK and then add a unique constraint on my column of type varchar(30)

    Once again, thank you for your support

  10. #10
    Join Date
    Feb 2004
    Location
    San Antonio, TX
    Posts
    565
    rcomaz

    please provide a bit of background on your db
    what kind of data is being stored there?
    what is a basic business model summary?
    what kind of data si being stored in the TIC(table in question)?
    and how large is the TIC?
    what are the related columns in other tables that reference the TIC?

  11. #11
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Quote Originally Posted by rcomaz
    But I think I wasn't clear enough: what is better :
    - to keep my column of type varchar(30) as PK
    - to add another column of type int for example as PK and then add a unique constraint on my column of type varchar(30)
    Geez! Interrupting all of our lovely philosophical debates just to get the original question answered!

    On a more serious note, it doesn't matter all that much as long as your database is both small (say under 20 Gb) and low traffic (under 1000 SQL statements processed per minute), especially if you rarely use foreign keys. As your size/traffic/complexity grow, I expect that you'll want to go toward the simple INT column. A four byte INT key takes less space in indices, foreign keys, etc than a thirty byte character key, so it takes corresponding less time/disk/etc. to process.

    -PatP

  12. #12
    Join Date
    Feb 2004
    Location
    San Antonio, TX
    Posts
    565
    hh;lkjhlkjolkhjlkjhn

  13. #13
    Join Date
    Jul 2003
    Location
    San Antonio, TX
    Posts
    3,662
    It's a no-brainer, just to second what Pat said, - it bottles down to how many values of a 4-byte size you can fit on a 8K page. This results in a number of logical reads when the optimizer scans through/seeks the index pages.
    "The data in a record depends on the Key to the record, the Whole Key, and
    nothing but the Key, so help me Codd."

  14. #14
    Join Date
    Jan 2004
    Location
    U.S.
    Posts
    26
    It can be important how the PK is indexed if you had the PK as varchar then there had to be a reason as to why. If you will be storing character data in the PK then use varchar and do a clustered index for better searching.

    Clustered Index = "When the DB searches data much like you would if you were trying to find a persons name in the phone book" Puts all the A's, B's, C's...... together for better performance.

    Index = "Is like when you are looking up the actual number" SQL will put the rows in numeric order

  15. #15
    Join Date
    Jul 2003
    Location
    San Antonio, TX
    Posts
    3,662
    PK, even if not clustered, has to yield a unique row per value, which fits your definition for a nonclustered index, which in tern fits your clustered index definition...what books are you reading??? Just a hint, - read from left to right, and once you reach the end of line, - do a carriage-return+line-fielf (vbCRLF, char(13)+char(10)) in your mind, so that you can continue reading the right way without confusing terms with definitions...But on a serious note, - come on, if you try to explain something to somebody, make sure you know it yourself, otherwise, - you're gonna run into a lot of trouble here
    "The data in a record depends on the Key to the record, the Whole Key, and
    nothing but the Key, so help me Codd."

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •