Results 1 to 4 of 4
  1. #1
    Join Date
    Aug 2004

    Unanswered: surrogate or composite key?

    The orininal design of my db (part of it...) is the following

    A JOB has a Number and a Description.
    Each JOB can have one or two TASKS (min one, max two). Each TASK is identified by the JOB it belongs to and an Index (unique only for the same JOB).
    Each TASK has one an only one set of INFO1, one and only one set of INFO2, one and only one set of INFO3 etc.

    A: JOB (JobNum [PK], JobDescription, ...)
    B: TASK (JobNum [PK] [FKa], Index [PK], TaskDescription, ...)
    C: INFO1 (JobNum [PK] [FKb], Index [PK] [FKb], ...)
    D: INFO2 (JobNum [PK] [FKb], Index [PK] [FKb], ...)

    (There is a reason to keep INFO1, 2 and 3 separate, because eachof them will be linked to different table. This might influence the answer to my real question.)

    First of all, I wouldn't add any surrogate key for TASK, not to loose the logic behind; plus I'd put an ined on JonMum only, being Index equal to 1 or 2 only, so not selective.

    The real question is about INFO1 (and 2, 3 etc.) table: should I leave JobNum and Index as PK (consider that the PK of INFo1 will be used as FK for another table), or should I use a surrogate key, like for eaxmple

    C: INFO1 (Info1ID [PK], JobNum [FKb], Index [FKb], ...)

    I don't really like this solution. Actually I'd prefer the following

    C: INFO1 (Info1ID [PK], ...)

    where Info1ID = JobNum + Index (+ = string concatenation).

    Any suggestion?

  2. #2
    Join Date
    Apr 2002
    Toronto, Canada
    Quote Originally Posted by TheGios
    I don't really like this solution.
    why not? it's much better than an artificial concatenation of two values into one column | @rudydotca
    Buy my SitePoint book: Simply SQL

  3. #3
    Join Date
    Jun 2003
    Provided Answers: 1
    Absolutely do not go with the "superkey" approach (concatenating pkey columns into a string). I have neither the space nor time to tell you all the ways this is bad.

    I am about as big an advocate of surrogate keys as there is on this board, but even I don't see the point of adding a surrogate key to table which will not have any subtables. What for? Just set Jobnum and Index as a composite primary key. Though while you are at it, find a better field name than "index"...

    Also, I seriously question why you can't keep Info1, Info2, and Info3 data in the Task table (or combine them as a single subtable). There are a few good reasons for creating schemas with 1-to-1 relationships, but not many. They usually have to do with either space or security issue, and it doesn't appear that you dealing with that here.
    Last edited by blindman; 08-23-04 at 01:31.
    If it's not practically useful, then it's practically useless.

    blindman "sqlblindman"

  4. #4
    Join Date
    Aug 2004
    Ok, let's put it this way...

    A JOB describes the processment of, let's say, a document. Each document can exist in two versions: English and French. Each TASK describes the processement of either the ENG or FRE version, that's way TASK has 1 or 2 TASKs: Index=1 describes the processement for the ENG version, 2 for the FRE. actually, let's rename Index to Version, whose domain is {E, F}.
    Both version are processed in a similiar way, but the FRE may require more steps than the ENG (like translation, for example). So let's say that each TASK is divided into 3 SUB_TASKs: TRANSLATIOM (that is INFO1), TYPING (INFO2) and REPRODUCTION (INFO3). For each TASK there can be 0 or 1 SUB_TASK, and the info for each SUB_TASK can be different for ENG and FRE. Take for example TRANSLATION: the ENG requires this SUB_TASK, the FRE doesn't. Take TYPING: the due date for the FRE can be different from the due date for the ENG.

    So we have the following:

    A: JOB (JobNum [PK], DocReference, StartDate, EndDate, ...)
    B: TASK (JobNum [PK] [FKa], Version [PK], Priority, ...)
    C: TRANSLATION (JobNum [PK] [FKb], Version [PK] [FKb], DueDate, ...)
    D: TYPING (JobNum [PK] [FKb], Version [PK] [FKb], DueDate, ...)
    E: DISTRIBUTION (JobNum [PK] [FKb], Version [PK] [FKb], Copies, ...)

    To complicate things, each SUB_TASK (TRA, TYP and REP) has one or more assignments BUT the assignments for TRA record different information than the assignments for TYP and REP. so we have...

    F: TRA_ASSIGN (JobNum [PK] [FKc], Version [PK] [FKc], Translator, ...)
    G: TYP_ASSIGN (JobNum [PK] [FKd], Version [PK] [FKd], Typyst, ...)
    H: REP_ASSIGN (JobNum [PK] [FKe], Version [PK] [FKe], Pages, ...)

    I wish I could send a pic of the ER diagram...
    Actually, it might be possible to include TRA, TYP and REP into TASK, but I'll have the following...

    B: TASK (JobNum [PK] [FKa], Version [PK], Priority, TranslationDueDate, TypingDueDate, ReproductionDueDate, TranslationReceivedDate, TypingReceivedDate, ReproductionReceivedDate, TranslationCompletedDate, TypingCompletedDate, ReproductionCompletedDate, ...)

    which means a lot of wasted space (a doc may not require TRA and REP, so 6, or more, NULL fields).

    Actually i'm sort of lost... maybe there is another and better way to model this...

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts