I am new to database and have a unique situation with regards to the design of tables that require nested query "look-ups." Let me explain my problem in more detail that may help the readers have a better understanding of the situation.
I am trying to design a blogging system that each "entry" can be assigned a set of "tags." For instance, an entry can have a set of categories associate with it: tagA, tagM, and tagZ. Now each tag can also contain a "set" of _subcategories_: tagM can have tagMa, tagMb, and tagMc subcategories which are related semantically to tagM, for whatever reason. This subcategori[fication] can go further deep by having tagMca, tagMcb, and so on. Think of it as a tree where every node can have a number of children.
I can create a table with tagID: unique for each tag, tagSuper: a tag that is a root to this node tag, if any. Now, as for the queries, when I seek to pull entries that have tagM associated with them, I would like to query entries with tagM and all its children nodes with no depth limit, that is tagM[a-c], tagMc[a-b] (5 subcategories + 1 tagM itself) in our example.
My question is, first, how to construct such query, second whether such approach is sound or should be banned altogether in favor of other designs?
what you should consider is joe celko's nested set model which can return an entire subtree to any depth
if it's me, though, i would associate an entry with a set of tags, same as you, but i wouldn't attempt to return the subtree for that category -- if the entry is related to any subcategories, then i would link the entry to those subcategories, and if this means associating it with all the subcategories, becuae they're all relevant, then i would associate the entry with all the subcategories, because then i can continue to use the adjacency list model which is, to me, anyway, a lot simpler to understand and implement
Thank you for your prompt response. I think I've gone a bit over my head on this topic which begs for more of advanced DB designers. From what I have gathered on the algorithms and set models you mentioned, I can see a big hurdle of issues popping out if I seek to follow my own requirements. For example, what will happen if I desire to delete a certain tag that has a nested set of categories hanging from it?
I am sure these models answer such modifications to the set but as you mentioned, the implementation would be arduous. And what layer should be responsible for such task, the DB in procedural functions or at the application level? I agree that I should be dealing with set of categories associated with each entry rather dealing with a chunk of the graph. I most definitely would like to read on these topics and have a better understanding of representing trees and hierarchies.
That would reflect my own conclusion as well to avoid it in the first place but just to raise a cruel example, which I am sure you can present a counter argument against: ORM (Object/Relational Mapping) tag which can go under either OO and DB type [super]-categories. I understand that this is a slothful induction but I hope you get the picture. In either case, after some consideration, I believe for now I should stick with a simple id-value table and avoid this whole inheritance complexity although it would make a rewarding puzzle-solving algorithm.
Again, thanks for your guidance as it shed some light on a few topics I wasn't aware even existed. I also managed to fine a book title by Mr. Celko on such subjects: Trees and Hierarchies in SQL for Smarties.