I am trying to come up with an algorithm for the following problem:
I have a table of user_groups and a table of user_groups_heirarchy.
Each user_group may consist of other user_groups which are considered to be its sub_groups (no cycles allowed of course). The information about the sub groups of each group appears in the user_groups_heirarchy table.
The table of user_groups_heirarchy has 2 columns: parent_user_group_id and child_user_group_id (which are both foreign keys to the user_groups id column).
So, the user groups and the relations between them are essentialy a directed acyclic graph (DAG).
Given a user_group id, I want to find all its descendents (that is, all the groups which are its sub_groups, whether directly or indirectly). This is to be done of course with minimum calls to the db.
I have no problem of holding redundant data and additional tables for this task. It just needs to be done as efficiently as possible. Does anybody know of an algorithm for this problem? All the regular DAG algorithms are very efficient when done in memory, but require too many db accesses.
Joe Celko cracked that nut a long time ago, the solution was written up in DBMS Magazine. His solution only requires a single trip to the database, no matter how many levels of nested sets you need to retreive.
Thanks Pat for the help, but I already tried using the algorithm you suggested.
The problem is that this algorithm is good for trees, and what I have is a DAG (essentialy meaning that a specific user_group may belong as a son to more than one user_group).
I didn't manage to think of a smart way to extend this algorithm for DAGs.
If you have any ideas I would be very glad to hear of them.
Each row (node) of the structure that Joe talks about is actually a non-crystaline DAG element. Include a link (foreign key) into another (essentially heap structured) table, and voila! You can easily represent any data item from the heap at more than one node. A bit abstract, but fast and relatively simple.
I lost you a bit...
First of all, I don't really know what's a non-crysaline DAG element and what's a heap structured table. Second of all, I didn't understand exactly how to extend the solution to a DAG.
According to the solution for trees, I have exactly one row for each node. Since each node has at most one incoming edge, that's enough to represent the whole structure. So how do I confront this with multiple incoming edges?
Are you suggesting to "duplicate" each node (and make a copy of it for each incoming edge) to essentialy recieve a forest? And if so, how are you suggesting to do so?