Results 1 to 11 of 11
  1. #1
    Join Date
    Oct 2011
    Posts
    1

    Designing Database to compare courier rates

    I want to design a database to compare courier rates to different cities.
    It will be my first relational database. I thought it would be easy (and it probably is) but I am having trouble with the concept.

    My plan was 3 tables:

    Courier
    courierName - PK
    contact
    phone
    email

    Destination
    destID - PK
    city
    state
    distance

    Rate
    rateID - PK
    destID - FK
    courierName - FK
    price
    pricePerDistance


    Hopefully the fields are pretty self explanatory but basically I would want to do a search for "Springfield" and the database would return something like

    Springfield PA Purolator $50
    Springfield PA UPS $60
    Springfield PA DHL $70
    Springfield VA Purolator $70
    Springfield VA UPS $60
    Springfield VA DHL $50

    Do my tables look right? I was having a hard time wrapping my head around the duplicate city issue. should "State" be a separate table?

    Any help would be greatly appreciated

    Thanks!

  2. #2
    Join Date
    May 2008
    Posts
    277
    For the most part you seem to be in the ballpark.

    Be careful with your primary keys. Some of the them (destid, rateid) appear to be surrogate keys. Even if you decided to use a surrogate key as your primary key, you need to enforce the natural key using a UNIQUE constraint. For destination, the natural key would be city and state. For rate, it would be courier name and destination.

    By "duplicate city issue," I presume you mean that some cities have the same name? That is why you need to include the state as part of your key. I would suggest creating a separate "look-up" table that contains all the states. The destination table would then hold a foreign key to the state table.

    Is there a difference between the price and pricePerDistance columns in the rate table?

    Finally, your database seems to assume the starting point is always constant (wherever that may be). I don't know what you're using this for, but you might want to consider how you would handle deliveries between any two destinations.

  3. #3
    Join Date
    Nov 2011
    Posts
    30
    Rog345

    Ok, I'm not sure how to answer this question because I don't know exactly what you're looking for, but I'll give it a shot.

    This is example of what I believe you are looking for: You want to know how much money it costs for a courier to deliver a package to a certain destination.

    Do my tables look right? I was having a hard time wrapping my head around the duplicate city issue. should "State" be a separate table?
    The problem with having State has a table is that you can't really describe it. Meaning, what else are you going to put in the table besides state? If you only have one attribute, which is the primary key, the table is not worth saving, in most cases. I believe that futurity does make a good argument about putting state as a primary key, although, this goes against theory.

    One question I do have is within the Destination field, you have distance. Now, what does this mean exactly? The distance from your house/business to the destination? This could be taken many ways, but it's important to know this because it could very well change the structure of the database.

    I understand that when you put in a search for "Springfield" all of those records come up. What you could do is create a query and search for Springfield (from the city) and maybe the courier name. Again, I'm not sure exactly what you're looking for, but if you're trying to eliminate the city "Springfield" from appearing too many times when you do a search, don't worry about it. The database is doing exactly what it's supposed to be doing. One thing I would do is remove the RateID from the Rate table. It goes against theory because there can only be one primary key per table and you already have one (the composite key, which is the combination between destID and couriername). The combination of both the destID and couriername creates your unique record, so the rateID is not needed.

    Other than that, I think that your logic is sound.

    Please post back for any more questions.

    Martin

  4. #4
    Join Date
    May 2008
    Posts
    277
    Quote Originally Posted by Martin22 View Post
    The problem with having State has a table is that you can't really describe it. Meaning, what else are you going to put in the table besides state? If you only have one attribute, which is the primary key, the table is not worth saving, in most cases.
    This is certainly not a problem. Just because a table only has one column -- or more accurately, just because it's only columns are the primary key -- does not mean the information is not worth saving.

    On a practical level, the possible values for state are discrete. It's perfectly correct to put these values in a table and use a foreign key constraint to ensure that all occurrences of a state in the database are valid.

    On a theoretical level, a state is an entity quite independent from a destination (and for that matter, so is a city, the list of which also happens to be discrete, just much larger). Just because we don't record a destination to a particular state (or city) does not mean that the state (or city) doesn't continue to exist on its own. Similarly, if we add a new destination to a state that hasn't been previously recorded, that does not mean the state suddenly pops into existence. We may be interested in recording the existence of an entity even if we aren't interested in recording any extra information about it. It is perfectly correct to record the list of possible states in a table independent of the list of destinations.

    That said, there are design decisions to be made about how much "reference" data to include in the database. The OP's needs might be met by a single destination table. An enterprise database may have a table of states as well as a comprehensive table of cities in addition to the destination table. Either way, he needs to be aware of the consequences of his decisions. Even if he decides to only record those states for which he also records a destination, not having a separate table for states leaves him susceptible to problems such as entering "Boston, Massachusetts" and "Springfield, Massachussetts" as destinations.

    I believe that futurity does make a good argument about putting state as a primary key, although, this goes against theory.
    Certainly not.

    The OP's original table is:
    id -> city, state, distance

    First, let's dispense with the notion that using a surrogate key does anything for data integrity:
    Code:
     id |     city    | state | distance
    ----+-------------+-------+----------
     1  | Boston      | MA    | 100
     2  | Springfield | MA    | 50
     3  | Springfield | IL    | 750
     4  | Springfield | MA    | 50
    So we've satisfied the "primary" key, yet we have duplicate data. And this is the probably the best case scenario. Consider:
    Code:
     id |     city    | state | distance
    ----+-------------+-------+----------
     1  | Boston      | MA    | 100
     2  | Springfield | MA    | 50
     3  | Springfield | IL    | 750
     4  | Springfield | MA    | 75
    So once we throw out id as a key, we need determine the "real" (ie, natural) key. City does not suffice, because even though we'll eliminate the duplicate "Springfield, MA" entries, it also prevents us from entering "Springfield, IL". Incorporating state into the key solves this:

    city, state -> distance

    Now once again, design decisions may lead us to add a surrogate key to the table and use that for foreign key references -- which is exactly what "surrogate" means: a substitute for the real thing -- however that does not eliminate the need to enforce the functional dependency given above.

    Now, if you're claiming that it's wrong to have multiple keys in one table, then I would direct you to this thread.

  5. #5
    Join Date
    Nov 2011
    Posts
    30
    Quote Originally Posted by futurity View Post
    This is certainly not a problem. Just because a table only has one column -- or more accurately, just because it's only columns are the primary key -- does not mean the information is not worth saving.

    On a practical level, the possible values for state are discrete. It's perfectly correct to put these values in a table and use a foreign key constraint to ensure that all occurrences of a state in the database are valid.

    On a theoretical level, a state is an entity quite independent from a destination (and for that matter, so is a city, the list of which also happens to be discrete, just much larger). Just because we don't record a destination to a particular state (or city) does not mean that the state (or city) doesn't continue to exist on its own. Similarly, if we add a new destination to a state that hasn't been previously recorded, that does not mean the state suddenly pops into existence. We may be interested in recording the existence of an entity even if we aren't interested in recording any extra information about it. It is perfectly correct to record the list of possible states in a table independent of the list of destinations.

    That said, there are design decisions to be made about how much "reference" data to include in the database. The OP's needs might be met by a single destination table. An enterprise database may have a table of states as well as a comprehensive table of cities in addition to the destination table. Either way, he needs to be aware of the consequences of his decisions. Even if he decides to only record those states for which he also records a destination, not having a separate table for states leaves him susceptible to problems such as entering "Boston, Massachusetts" and "Springfield, Massachussetts" as destinations.



    Certainly not.

    The OP's original table is:
    id -> city, state, distance

    First, let's dispense with the notion that using a surrogate key does anything for data integrity:
    Code:
     id |     city    | state | distance
    ----+-------------+-------+----------
     1  | Boston      | MA    | 100
     2  | Springfield | MA    | 50
     3  | Springfield | IL    | 750
     4  | Springfield | MA    | 50
    So we've satisfied the "primary" key, yet we have duplicate data. And this is the probably the best case scenario. Consider:
    Code:
     id |     city    | state | distance
    ----+-------------+-------+----------
     1  | Boston      | MA    | 100
     2  | Springfield | MA    | 50
     3  | Springfield | IL    | 750
     4  | Springfield | MA    | 75
    So once we throw out id as a key, we need determine the "real" (ie, natural) key. City does not suffice, because even though we'll eliminate the duplicate "Springfield, MA" entries, it also prevents us from entering "Springfield, IL". Incorporating state into the key solves this:

    city, state -> distance

    Now once again, design decisions may lead us to add a surrogate key to the table and use that for foreign key references -- which is exactly what "surrogate" means: a substitute for the real thing -- however that does not eliminate the need to enforce the functional dependency given above.

    Now, if you're claiming that it's wrong to have multiple keys in one table, then I would direct you to this thread.
    I understand that your methodology is sound, although, two primary keys are not allowed. Please refer to this link: CIS-165: 4. Designing Databases - CIS-165PH - Cabrillo College

    Theory is very hard to follow, especially if you want to do the job correctly.

    So, a table can have only one primary key, but can have multiple keys (called a composite primary key), which is also a constraint.

    Martin

  6. #6
    Join Date
    May 2008
    Posts
    277
    Quote Originally Posted by Martin22 View Post
    I understand that your methodology is sound, although, two primary keys are not allowed. Please refer to this link: CIS-165: 4. Designing Databases - CIS-165PH - Cabrillo College
    I did not suggest having multiple primary keys on the table. The page you linked to describes candidate keys, from which a primary key is selected. As it stands, the destination table has one candidate key:
    city, state -> distance

    If we decide to add a surrogate key, then we've created another candidate key:
    city, state -> id, distance
    id -> city, state, distance

    One of these is selected as the primary key. Presumably, we'll use the surrogate key, since we went through the trouble of adding it. However, the other candidate keys remain just a valid, and must still be enforced:
    Code:
    create table destination (
       -- surrogate key
        destination_id int primary key,
    
        -- natural key
        city_name varchar(50) not null,
        state_code char(2) not null,
        unique (city_name, state_code),
    
        distance int not null check (distance > 0)
    );
    So, a table can have only one primary key, but can have multiple keys (called a composite primary key), which is also a constraint.
    A table can have any number of candidate keys. A composite key is a single key comprised of multiple columns (see my table definition for the natural key, above). A primary key is simply a candidate key we have designated as being "primary," otherwise there is no substantive difference from the other candidate keys, though there is perhaps an implicit understanding that the designated primary key will be relatively stable over time (whereas the other candidate keys may not).

    Everything I've said here is consistent with the page you linked to.

  7. #7
    Join Date
    Nov 2011
    Posts
    30
    Quote Originally Posted by futurity View Post
    I did not suggest having multiple primary keys on the table. The page you linked to describes candidate keys, from which a primary key is selected. As it stands, the destination table has one candidate key:
    city, state -> distance

    If we decide to add a surrogate key, then we've created another candidate key:
    city, state -> id, distance
    id -> city, state, distance

    One of these is selected as the primary key. Presumably, we'll use the surrogate key, since we went through the trouble of adding it. However, the other candidate keys remain just a valid, and must still be enforced:
    Code:
    create table destination (
       -- surrogate key
        destination_id int primary key,
    
        -- natural key
        city_name varchar(50) not null,
        state_code char(2) not null,
        unique (city_name, state_code),
    
        distance int not null check (distance > 0)
    );


    A table can have any number of candidate keys. A composite key is a single key comprised of multiple columns (see my table definition for the natural key, above). A primary key is simply a candidate key we have designated as being "primary," otherwise there is no substantive difference from the other candidate keys, though there is perhaps an implicit understanding that the designated primary key will be relatively stable over time (whereas the other candidate keys may not).

    Everything I've said here is consistent with the page you linked to.
    I perfectly understand what you're saying. The terminology threw me for a loop, so I do apologize for that. I meant no disrespect in challenging your opinion. My professor drills us with using theory, as he tells us many databases go again it (theory that is). I'm a senior at Rowan University (you've probably never heard of it before), so I try to help others understand databases as I do. I figured that interrupting it another way may help others understand the concept better. It isn't easy to learn, but I feel in love in them, haha.

    Thanks for replying though!

    Martin

  8. #8
    Join Date
    Dec 2007
    Location
    London, UK
    Posts
    741
    Quote Originally Posted by Martin22 View Post
    I understand that your methodology is sound, although, two primary keys are not allowed. Please refer to this link: CIS-165: 4. Designing Databases - CIS-165PH - Cabrillo College

    Theory is very hard to follow, especially if you want to do the job correctly.

    So, a table can have only one primary key, but can have multiple keys (called a composite primary key), which is also a constraint.
    This has come up a few times but it's maybe worth going over again.

    There is no practical or theoretical reason why you can't have more than one "primary" key on a table. In relational database theory terms a primary key means exactly the same as a candidate key. There could be any number candidate keys and so you can just as easily call zero, one or more of them "primary" without altering the structure, meaning or function of the schema in any way. That was in fact the way in which E.F.Codd originally used the term "primary key" - he meant any and all candidate keys and not just one key per table.

    It is only convention and convenience that leads people to select one and only one key as "primary" so it really doesn't mean anything much to say that there must only be one primary key. That "rule of one" is only as important as you want it to be.

    In my view, for practical purposes the designation of a primary key at all is very often unnecessary and arbitrary. It's common enough for a table to have more than one key. Often the same data is shared between a range of different people and processes and those different people and processes may make use of different identifiers for the same piece of data. So where you do have multiple keys it may be that one key becomes more significant or preferred over another depending on the context. The idea that there is or should be just one universally "preferred" identifier is increasingly irrelevant and obsolete. In practice and as a matter of principle all keys are equal. At least that's the way it ought to be.

  9. #9
    Join Date
    Nov 2004
    Location
    out on a limb
    Posts
    13,692
    you can have as many unique keys as you wish, although for the life of me I can't see why you'd have more than two. the term primary key has more to with the human language rather than the database language.

    yes in a relational DB there must be a primary key which is guaranteed to uniquely identify a row, but you can also have another index which is unique. some times people choose to use a surrogate key (an system generated autonumber) in place of a composite key for prerformance or ease of use (especailly if the natural key is a multi column composite key. if there are two (or more) indexes which are unique then de facto you have two (or more indexes) which are interchangeable as a primary key, although the database implementation will only recognise one key as 'the primary key'.

    if there is a natural candidate key, then in my books you, which is what futurity as alluding to in post #2. however its perfectly reasonable, and understandable if a designer choose an autonumber column to simplify the design if say the natural key was, say a composite of, say 4 or more columns. those 4 or more columns would still appear in the table, and a canny developer would create a unique index of those columns to ensure you reinstalled the data integrity that you lost be declaring an autonumber column as your primary key.
    I'd rather be riding on the Tiger 800 or the Norton

  10. #10
    Join Date
    May 2008
    Posts
    277
    And to build on dportas's point, many (if not all) DBMS's allow foreign keys to reference ANY unique index on a table, not just the primary key. There is literally no difference between using PRIMARY KEY or using UNIQUE NOT NULL, and in fact generally there is no requirement to even declare a primary key (although it would certainly behoove you to declare at least one unique, not-null index).

    From that perspective, I wonder it might actually be more useful to the student to dispense with the notion of a "primary" key. There often seems to be a fundamental misunderstanding that once you've declared and enforced a primary key, all other keys cease to be relevant or valid. Surrogate keys are not so much a problem in that people use them, but that in declaring them to be the primary key, they no longer enforce the other/original keys. This isn't just a problem that occurs with surrogate keys, but it's certainly the most prevalent case.

  11. #11
    Join Date
    Dec 2007
    Location
    London, UK
    Posts
    741
    Quote Originally Posted by futurity View Post
    From that perspective, I wonder it might actually be more useful to the student to dispense with the notion of a "primary" key.
    Definitely! In my view the notion of "primary" keys has historically been given far too much prominence, often to the detriment of understanding real data modelling and data management issues. What students of database design ought to learn is the importance of keys in general: the issues arising in the choice, design and enforcement of all keys are significant. By contrast, as far as relational database design is concerned the designation of one "primary" key is usually a matter of very little importance - worthy of a footnote or a paragraph but not much more.

    Here's a little experiment that students might like to try. Whenever course materials or teachers mention the phrase "primary key" do a mental check as to whether A) they truly are describing something that must apply to only one key per table, or B) they mean something that could apply equally to any and every key of a table. Nine times out of ten the answer will probably be B). Hence this useful rule of thumb: There's a 99% chance that any statement that includes the phrase "primary key" is either wrong or irrelevant.

    Quote Originally Posted by healdem View Post
    yes in a relational DB there must be a primary key which is guaranteed to uniquely identify a row, but you can also have another index which is unique.
    Indexes fundamentally have nothing to do with keys. A key is a set of attributes which are required to be unique. An index is just a mechanism for speeding up access to data. Indexes are often used to support key enforcement but should never be confused with keys. It does seem that people regularly mix up the concepts of keys and indexes - probably due to some of the "primary" key misconceptions already mentioned.

    Quote Originally Posted by healdem View Post
    I can't see why you'd have more than two [keys].
    Why shouldn't things have more than two identifiers if it makes sense of some business requirement? Two possible examples off the top of my head: login name, payroll number and tax reference for employees. SKU, supplier part number and UPC for stock control.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •