Results 1 to 2 of 2
  1. #1
    Join Date
    Mar 2004

    Unanswered: Unknown / Null Foreign Keys


    Can anyone fill me in on what the latest thinking is on unknown / null foriegn keys, in a standard data model ? Obviously if I use 'null' as the foreign key, when I join ( normally ) to the associated table, I will miss out on the original row as there is no corresponding row ( let alone not being able to join on null ). I was thinking of making the value 0 or -1, and then placing a row in the associated table with a corresponding value which means 'unknown value', so any normal joins will always work.. This will save me having to do outer joins all the time too...

    But my questions is, firstly, do you think this is the way to go, and secondly, is there a standard value to use ( 0 or -1 ) ? Ive heard Ralph Kimball say that 0 should be avoided as often 0 can actually represent something meaningful in data modelling terms, but was also unsure if -1 would cause joins to be slower, or cause any problems if I wanted to partioning on this key, etc. ?

    Are there any Oracle DBAs out there with an opinion on this ?


  2. #2
    Join Date
    Sep 2002
    Provided Answers: 1
    Most Oracle designers would opt for a null for missing data, although nulls do have their issues and are not relationally pure. It is certainly a good idea to avoid nulls where possible: an obvious case being a NUMBER value like salary, where 0 is a perfectly good alternative in many cases. I'd much rather have a zero in the inapplicable salaries than sprinkle NVL(salary,0) throughout my SQL.

    I have never myself avoided a null foreign key value in the way you suggest, though. One possible issue would be that if there are a lot of "null" entries, then a query like "select * from emp where deptno=0" would not benefit from the index on deptno. You would need to create a histogram so that the optimizer knew a full table scan would be better.

    As for whether to use 0 or -1 or whatever, I would say that is your decision based on what makes sense for the domain of the column. If the column is based on a surrogate key populated by a sequence, then either 0 or -1 sounds fine; if it is a user-assigned code the a special code like 'N/A' might be more appropriate.

    As for partitioning, again if the foreign key is frequently "null" then it probably wouldn't be a good candidate for the partition key anyway!

    Unless someone else can contribute with practical experience of doing this, I suggest you need to test this idea thoroughly with realistic data volumes before committing to it.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts