I wanted to set up so IF the user were to look under subcategory1, they would see ALL the items that belonged to all the subcategories below it.
I decided to use a 'Modified Preorder Tree Traversal' system, and when each item is entered into the database, the 'left' and 'right' positions of the category are marked in the 'Items' table along with the other details of the item.
This makes it easy to retrieve items that are below any particular point in the tree.
BUT if I were to add categories to the tree, the 'left' and 'right' values of some categories would be affected and the whole items database would have to be updated to suit. (which isn't very scalable IS IT??)
After thinking the 'Modified Preorder Tree Traversal' system was a great idea to save big queries, I'm now wondering if this is the best route to take.
Ok, I've read into that topic and discovered wha I was calling 'Modified Preorder blah blah' has the colloquial name of 'Nested Sets'
I reckon I will stick with this then but not store any 'lft' or 'rgt' along with the item for query speed. Just joining the items table with the categories table to get the 'lft' and 'rgt' each time a query is made will have to suffice.
It'll certainly speed things up if I end up with 1m items and a new category to put in. Saves updating 2m values :-)
Apologies to Joe Celko, but I would discourage the use of nested sets except in specific instances. A normal "adjency" modeled tree structure is usually sufficent and easier to implement.
Your ultimate decision, though, may be based upon the data platform that you use. SQL Server 2005, for instance, has implemented a new concept called Common Table Expressions which is intended to help traverse adjency models, and Oracle has had similar functionality for quite some time. I am not aware of any database platforms which have features for supporting Celko's Nested Set model.
I can also tell you that the one time I have actually seen the Nested Set model implemented it was elegantly done, beautifully designed, meticulously documented, and still left the application developers clueless as to how to code around it.
If it's not practically useful, then it's practically useless.
well I chose the nested set idea for the reason that I can retrieve information with one simple query....
If this were the tree I was working with (items.gif)
...and I wanted to retrieve all the items that fall into 'category 1 (and therefore sub category 1 & 2)', I could do a query as simple as 'select items WHERE lft > x AND rgt < y'. I can't see how I could access that so easily using the adjacency model. I could only see me doing several queries to get the same results.
The reason why I use the adjacency model (parents and children) is because it pretty much works for anything and everybody understands it. We have a bunch of code at my current gig using left and right nodes and nobody dares touch it.
If performance is an issue then just build a cache table containing all the child ids for each parent id. Then if you want to find all the children under any node you just
select child_id from CachedTable where parent_id=x
With your nested sets you'll still have to cache the tree to set your left and right values. The pain will come when anyone else might want to understand it or if you want to have children under multiple parents etc.
Obviously you've read the info and it's up to you but ...