As a software developer, I have a solid understanding of relational databases and normalization and have designed many. But what is the best or most pragmatic answer to the whole debate of 'when to use the real candidate key, when to use an identity PK', and the byproduct concerns around joining on keys that are strings (varchar) as opposed to an identity?
The advantage of natural keys is that they exist already, you don’t need to introduce a new “unnatural” value to your data schema. However, the disadvantage of natural keys is that because they have business meaning they are effectively coupled to your business: you may need to rework your key when your business requirements change.
Personally I have been burned more than once by choosing a natural key only to have the business change later on. My rule of thumb is to go with surrogates unless it can be guaranteed that the natural will not change definition.