# Thread: Candidate Key vs. Super Key

1. Registered User
Join Date
Jun 2009
Posts
6

## Candidate Key vs. Super Key

Hi,

I have a question regarding fundamental database modeling theory when it comes to defining CK and SK. I have been reading on number of sources discussing the difference between candidate key and super key but none has satisfied my hunch on how they actually differ. Let me put it as simply as possible:

Is super key basically a "proper superset" of candidate key?

Before I expound on my understanding of their delicate definitions, let me ask another basic question:

When it is said a combination, or set of, attributes comprise a candidate key, does that mean that each and every single of the attributes "by itself" are supposed to uniquely identify the tuples? Or their "set of combinations," taken together, have to possess such characteristic? For instance, if t1, t2, and t3 are all candidate keys, does a whole set of {t1, t2, t3}, combined has to uniquely identify each tuple or each single attribute, alone, in the relation should identify the rows?

Back to the super key and candidate key; let's assume that a relation has the following attributes, A, B, C, D, E, F, G. Among these attributes, A, B, and D do pass the litmus test in order to be considered as candidate keys. That is to say, C, E, F, and G are functionally dependent on A, B, and D. From the theoretical definition of candidate key and super key, I would define the following FD's:

{A, B, C, D} -> {E, F, G}

{A, B, D, E, F} -> {C, G}

{A, B, D, G} -> {C, E, F}

{A, B, C, D, E, F} -> {G}

{A, B, D} -> {C, F}

Am I right to claim, the first 4 sets are "super key" but only the last set ({A, B, D}) is the "candidate key" which means the super key is basically a candidate key that could contain attributes other than the bare minimum attributes in order to satisfy a functional dependency?

Sincerely,

2. Registered User
Join Date
Dec 2007
Location
London, UK
Posts
741
A superkey is any set of attributes (K) such that no two tuples of a relation can have the same values for K, i.e: t1(K) != t2(K) for every possible pair of distinct tuples, t1,t2. A candidate key is a superkey such that no proper subset of K is also a superkey.

This is often summarised by saying that a superkey is unique whereas a candidate key is both unique and irreducible.

So everything you have said is correct except for your use of the term "proper superset". The attributes of a superkey are always a superset of a candidate key but not necessarily a proper one because a candidate key is called a superkey as well.

Pretty much everyone seems to write the word "superkey" instead of "super key". I don't know why.

3. Registered User
Join Date
Jun 2009
Posts
6
Thank you.

You are right about the "proper superset". It actually makes sense, semantically, to refer to the sets of "super key" and "candidate key" as "super/subset" rather than "proper" because, this way, we can distinguish both while labeling them properly. If it was to be called a "proper super/subset", then super key and candidate key could be equal -- containing exact same set of attributes. But by eliminating the "properness" [sic] among the sets, candidate key becomes unique and minimalistic which can be distinguished from super key.

Just to make sure, I also asked about the taking the set of candidate key as a whole or individually for litmus test of uniqueness among tuples. From your answer, I take it the former should hold true. For instance, assuming A, B are the candidate keys in relation R: {A, B, C, D} and we have the following instances/tuples:

t1: A=a1, B=b1, C=c1, D=d3
t2: A=a1, B=b2, C=c1, D=d3
t3: A=a1, B=b3, C=c1, D=d3
t4: A=a2, B=b2, C=c1, D=d4
t5: A=a3, B=b4, C=c2, D=d4

Obviously, none of the "combination" of {A, B} set (candidate key) is a duplicate of another although various values for the attributes, "individually", are the same for different tuples (i.e. t1.A = t2.A = t3.A and t2.B = t4.B). Considering all that, how are we supposed to pick a primary key among these candidate keys when none uniquely identifies the rows?

P.S. I am new, but I use "super key" as oppose to "superkey" just to be consistance with the same format I use when I write "candidate key" instead of "candidatekey".

4. Registered User
Join Date
Dec 2007
Location
London, UK
Posts
741
Originally Posted by Inlish
Obviously, none of the "combination" of {A, B} set (candidate key) is a duplicate of another although various values for the attributes, "individually", are the same for different tuples (i.e. t1.A = t2.A = t3.A and t2.B = t4.B). Considering all that, how are we supposed to pick a primary key among these candidate keys when none uniquely identifies the rows?
I don't see what you mean. The set of attributes {A,B} appears to be the only candidate key of this relation. The primary key is just one of the candidate keys (not one of the attributes of a candidate key). If there is only one candidate key then there is obviously only one choice for a primary key.

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•