If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > MySQL > Friends Network (undirected graph) Database Design

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 10-10-06, 12:11
sideral sideral is offline
Registered User
 
Join Date: Oct 2006
Posts: 2
Lightbulb Friends Network (undirected graph) Database Design

Hi. I see three possible ways in which a friends network database can be designed. Each way has its own advantages and disadvantages. Each one leads to very different querys and performance. There aren't many places where this is discussed, although I think this is an interesting problem. I would like you to give me your opinion on which one you think would be better.

Excuse my lax sytax, i just want to highlight the important points.

The first way

TABLE user(
user_id int primary key
)

TABLE Friend(
user_id int
friend_id (foreign key references user.user_id)
)

In this design, the Friend table saves who befriended whom, i.e. if user 1 befriended user 2, in the Friend table, one row would be added: 1 2.

To insert a friend this would be the query: INSERT INTO friend VALUES ( 1, 2 );

To get the list of friends of user 1, this query would work:

(SELECT friend_id FROM friend WHERE user_id = 1)
UNION
(SELECT user_id FROM friend WHERE friend_id = 1)

The problem with this approach is that it gets difficult to know, for example, who are friends of user 1, since user 1 can be on the first column or in the second column (see the query). Similarly, all CRUD operations become difficult, requiring unions or the use of OR in where clauses, which sometimes mysql doesn't optimize very well.

The second way

It is similar to the first way except that for each friend relationship that is created, two rows are added to the friends table. In the preceeding example, the rows 1 2 and 2 1 would be added, reflecting the fact that user 1 is friend of user 2 and, also, user 2 is friend of user 1.

To insert a friend this would be the query: INSERT INTO friend VALUES ( 1, 2), (2, 1)

To get the list of friends of user 1, this would work:

SELECT friend_id FROM friend WHERE user_id = 1

In this design, it's easy and fast to get the list of friends for any user. However, one can say that there is duplicated data, i'm not sure if this can lead to further problems. This design highlights the nature of friendship, where if user 1 is user's 2 friend, the revense is also true.

The third way

The friend table would look like this:

TABLE friend(
friendship_id int,
user_id int (foreign key references user.user_id)
)

Here two rows are added for each pair of friends. The friendship itself gets its own id, but it is not a primary key, since it would be repeated twice. For instance, if user 1 and user 2 were friends, this would be represented like this:

friendship_id user_id
124 1
124 2

And the same for each pair of friends. This may lead to entangled querys, but with good performance.

To insert a friendship between user 1 and 2 this would be the pair of querys:

SELECT @max_friendship_id := MAX(friendship_id) FROM friend;
INSERT INTO friend VALUES( @max_friendship_id + 1, '1' ), (@max_friendship_id + 1, '2')

To find the friends of user 1, this would do the job:

SELECT user_id
FROM friend where
friendship_id IN (select friendship_id from friend where user_id = '1') AND user_id != '1'

To get the list of friends of user 1, one would need to do a nested query (bad performance in mysql), or I suppose this could be rewritten as a self-join. I think this design could be good if there is some extra information associated with the friendship itself, since each friendship has its own id. Also, maybe, this is the most 'relational' of the three designs.


Well, these are the possible designs i see, but I don't get to decide which would be the best. This problem, in fact, isn't unique to social networking , but this question is the same as how to represent an undirected graphs within a relational database, for which I haven't found a clue in database design texts.

I hope I stated the problem well. Thank you for your contributions and comments!

Last edited by sideral; 10-10-06 at 14:06.
Reply With Quote
  #2 (permalink)  
Old 10-10-06, 13:05
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
excellent synopsis, and very nicely discussed

my choice would be the second way

why? because suppose you suddenly decide that it does matter who befriended whom -- what would be the impact on your queries for each of the three ways?

the first way, for undirected, means complex queries which would have to be un-complexed

the third way is impossible

the second way means no change whatsoever
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #3 (permalink)  
Old 03-01-10, 23:55
thenobot thenobot is offline
Registered User
 
Join Date: Mar 2010
Posts: 1
Here is another option for finding friends of a given user...

Given the table "users" which is the master list of users, and the table "friends" which has two foreign key columns user1_id and user2_id, either of which can be the user in question (id=123):

SELECT * FROM users JOIN friends ON users.id = IF(friends.user1_id=123, friends.user1_id, friends.user2_id) WHERE (friends.user1_id=123 OR friends.user2_id=123)
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On