Symmetric Self Referencing Many to Many Relationship
I've got this problem i've been mulling over for ages and can't seem to come up with an answer for it:
I want to create a db that stores country information so i'll probably have a country tables as follows:
countryid countryname etc
I also want to store information about what country borders another country. To my mind this is a self referencing many to many relationship so i create a borders table to resolve the many to many as follows:
the problem with this is the relationship borders is symmetrical ie. if france borders germany then germany should also border france, yet with this design france can border germany without germany bordering france.
any thoughts, ideas, comments, etc. would be most welcome.
In your join table you will have columns for CountryA and CountryB.
Which table comes first can be arbitrary, though if you use Rudy's suggestion of enforcing a constraint against CountryB ever being a lower value than CountryA you can prevent duplicate relationships such as:
Record CountryA CountryB
1 USA Canada
2 Canada USA
3 USA Mexico
...not that such duplicates would be a huge issue, as long as you UNION the table to itself to get the entire set of relationships from either direction, and use this UNION view in all your code joins:
select CountryA, CountryB from CountryNeighbors
select CountryB as CountryA, CountryA as CountryB from CountryNeighbors
This result set would effectively show you all bordering countries for every country.
If it's not practically useful, then it's practically useless.
If you just want to enforce the constraint that borders must be symmetrical ...
that's a cute trick, but it more or less locks you in to double entries
(by the way, in which database systems would inserting the (1,2) fail because the (2,1) isn't there yet? does this technique require suspending FK checking until after the transaction? this is getting messier by the minute...)
Another way to do this is to name or number each border and treat the border itself as a set (ie. the set of adjacent countries):
CREATE TABLE Borders (BorderNum INTEGER NOT NULL, CountryNum INTEGER NOT NULL, PRIMARY KEY (BorderNum, CountryNum));
This has the possible advantage that there is only one column in which to look for a country.
i can see that this would solve the problem but it kind of feels a bit wrong - a bit too border centric - i'm probably mixing the db design with the end application here - but this design feels like it would force me to be adding countries to borders, whereas i'd envisaged adding a country neighbours to countries, if you see what i mean?
One of the goals of normalization in database design is to reduce or eliminate duplicate data. (If you have duplicate data, what happens when you update/delete the first row, but forget to update/delete the second??? After the change, if you don't know the answer already, how do you then know which row is correct?)
After all, it's very easy to add a check constraint to the table, so that CountryA must be less than CountryB. (and, in a case like this, where once entered, the table data will change rarely, it's not as if there would be a performance hit...)
Last edited by loquin; 04-21-10 at 18:32.
"Lisa, in this house, we obey the laws of thermodynamics!" - Homer Simpson
"I have my standards. They may be low, but I have them!" - Bette Middler
"It's a book about a Spanish guy named Manual. You should read it." - Dilbert