Results 1 to 12 of 12

Thread: Primary Key

  1. #1
    Join Date
    Feb 2004
    Posts
    134

    Unanswered: Primary Key

    I am setting up some tables where I used to have an identity column as the primary key. I changed it so the primary key is not a char field length of 20.

    Is there going to be a big performance hit for this? I didn't like the identity field because every time I referenced a table I had to do a join to get the name of object.

    EG:
    Code:
     
    -- Old way
    tbProductionLabour
    ID (pk)| Descr | fkCostCode
    ----------------------
    1  | REBAR | 1J
    
    tbTemplateLabour
    fkTemplateID | fkLabourID | Manpower | Hours
    --------------------------------------------
    1                 | 1             | 1             | 0.15
    
    -- New way
    tbProductionLabour
    Labour | fkCostCode
    ---------------------
    REBAR  | 1J
    
    tbTemplateLabour
    fkTemplateID | fkLabour | Manpower | Hours
    -------------------------------------------
    1                 | REBAR    | 1             | 0.15
    This is a very basic example, but you get the idea of what I am referring to.

    Any thoughts?

    Mike

  2. #2
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    Quote Originally Posted by MikeB_2k4
    I didn't like the identity field because every time I referenced a table I had to do a join to get the name of object.
    [thud]
    The light! Don't look at the light!

    I guess I'll add you to the "not preffering surrogates" group

    http://weblogs.sqlteam.com/brettk/ar...6/09/1530.aspx

    Excuse me while I climb back upon my barst...um desk chair...yeah that's right...

    [/thud]
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  3. #3
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    EDIT: Didn't we have this conversation already?
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  4. #4
    Join Date
    Feb 2004
    Posts
    134
    Quote Originally Posted by Brett Kaiser
    EDIT: Didn't we have this conversation already?
    Probably, but as an in-experienced developer (wannabe) I was wondering if not using a identity field will really make that great of a performance difference.

    I think the char(20) for a primary key is better mainly because when I open a table I am not seeing a number which I will have to look up. It also would reduce the number of joins I would have to make (which I guess would be better for performance). But, for looking up values or joining on that field, would the performance reduction be neglable or significant enough to want to use the identity field?

    Mike

  5. #5
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    Let me ask you this...

    What do you think would be faster.

    A). An umpteen table join on surrogate keys to get the data, or

    B). A SELECT against 1 Table
    Last edited by Brett Kaiser; 09-23-04 at 15:31.
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  6. #6
    Join Date
    Feb 2004
    Posts
    134
    Quote Originally Posted by Brett Kaiser
    Let me ask you this...

    What do you think would be faster.

    A). An umpteen table join on surrogate keys to get the data, or

    B). A SELECT against 1 Table
    I would assume option B, but let me ask you this.

    What do you think would be faster:
    A) Looking up values based on an integer
    or
    B) Looking up values based on an umpteen char string

    ?

    Mike B

  7. #7
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    OK, I'll give you 80/1000 of a second

    Code:
    USE Northwind
    GO
    
    SET NOCOUNT ON
    CREATE TABLE myTable99(Col1 sysname)
    CREATE TABLE myTable00(Col1 sysname, Col2 int IDENTITY(1,1))
    GO
    
    DECLARE @x int
    SELECT @x = 1
    WHILE @x < 1000
      BEGIN
    	INSERT INTO myTable99(Col1) SELECT TABLE_NAME FROM INFORMATION_SCHEMA.Tables
    	INSERT INTO myTable00(Col1) SELECT TABLE_NAME FROM INFORMATION_SCHEMA.Tables
    	SELECT @x = @x + 1
      END
    
    INSERT INTO myTable99(Col1) SELECT 'Brett'
    
    CREATE INDEX myIndex99 ON myTable99(Col1)
    CREATE INDEX myIndex00 ON myTable00(Col2)
    
    SELECT COUNT(*) FROM myTable99
    
    DECLARE @x1 datetime, @y1 datetime, @x2 datetime, @y2 datetime
    SELECT @x1 = GetDate()
    SELECT @x1 AS systime, 'Starting int look up'
    SELECT * FROM myTable00 WHERE Col2 = 216784
    SELECT @y1 = GetDate()
    SELECT @y1 AS systime, 'Endinging int look up'
    
    SELECT @x2 = GetDate()
    SELECT @x2 AS systime, 'Starting sysname look up'
    SELECT * FROM myTable00 WHERE Col1 = 'Brett'
    SELECT @y2 = GetDate()
    SELECT @y2 AS systime, 'Endinging int look up'
    
    SELECT DATEDIFF(ms,@x1, @y1), DATEDIFF(ms,@x2, @y2)
    GO
    
    SET NOCOUNT OFF
    DROP TABLE myTable99
    DROP TABLE myTable00
    GO
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  8. #8
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    For small tables with no updates, it doesn't matter much. If you start to update the char values you are using as keys, you'll lose hair very quickly. If you add rows (beyond about 100,000 or so), the numeric keys will be significantly faster, due to lower total physical IO.

    The short answer boils down to you can use what you want. As you scale upward, the surrogate keys look better and better!

    -PatP

  9. #9
    Join Date
    Feb 2004
    Posts
    134
    Quote Originally Posted by Pat Phelan
    For small tables with no updates, it doesn't matter much. If you start to update the char values you are using as keys, you'll lose hair very quickly. If you add rows (beyond about 100,000 or so), the numeric keys will be significantly faster, due to lower total physical IO.

    The short answer boils down to you can use what you want. As you scale upward, the surrogate keys look better and better!

    -PatP
    Yeah, I pretty much aggree with that. The tables I am refering to will not be that big and they are kind of complex so I thought it would be best to use as many natural keys as possible. Of course there are places in my DB where I use the identity because I don't think the natural keys are all that good to use.

    How many people use CompanyName as a natural key in a companies table?

    I understand there maybe more then one CompanyName in the table but should they be unique by appending a number or geographical location, or using the CompanyName / Address as the primary key.

    My thought is that it is best to use a surrogate here because to carry a company name and address as a forein key to other tables is probably costly.

    There is the argument that the address can change but is that really a problem since you can specify "Cascade update related fields" on the other tables?

    I would personally love to see something other then a number when I am looking at these tables but.....

    Mike B

  10. #10
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    At least in my opinion, company name stinks as a primary key. We aren't all that big, but we have several hundred companies scattered wildly about North America with the same name, and on a worldwide basis it gets even worse.

    JOINs are cheap. SQL Server makes them nearly free IF you keep the FK value small (INT or smaller) and you've got enough RAM in your server.

    -PatP

  11. #11
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    Quote Originally Posted by Pat Phelan
    JOINs are cheap.
    That's gotta be the most open ended statement I've heard in a while...

    Also... "Company's with the same name scattered around"?

    Either they truly are a different company, which means they are separate legal entity, or they are a site for a company....

    Are you essentially saying that accessing 1 table would be slower than accessing many?
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  12. #12
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Many are franchises, some just reuse common names in different jurisdictions. The net result is that if you look for companies with names like Subway or McDonald's you find hundreds of hits, most (but not all) with separate EIN values.

    If you have to haul a fifty byte VARCHAR off the disk versus a four byte integer for every row in a 34 million row table, versus a join to a table cached in RAM, then the JOIN is cheaper than the single table. While SQL Server is good at hiding physical details from the user, some things are still big enough tasks so that smart design beats brute force every time.

    -PatP

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •