Results 1 to 9 of 9
  1. #1
    Join Date
    Aug 2001
    Location
    London, UK
    Posts
    31

    Unanswered: The hierarchical query problem... Oracle-esque CONNECT BY PRIOR

    This has been forever on the MySql TODO list it seems.
    And yet again I find myself needing to represent tree structures of unknown depth in MySql.

    For those not familiar with this problem, consider a database table representing a a docroot:

    TABLE: Directory
    COL1: Directory_ID
    COL2: Directory_Name
    COL3: Parent_Directory_ID

    Parent_Directory_ID is only ever NULL for the root directory... e.g. C:\
    C:\ Has a Directory_ID of 1
    C:\Windows is Directory_ID = 2, Parent_Directory_ID = 1.
    C:\Windows\System32 is Directory_ID = 3, Parent_Directory = 2.

    And so on... it's not hard to encapsulate a whole docroot in one table... this isn't what I want to do, but it illustrates the concept.

    Now... this isn't a rare type of representation of data... so I guess that a lot of you have encountered this too... what I would like is to know how you have all surmounted the issue... what solution did you implement to get round this limitation that could help me design my schema to fit it.

    Cheers

    David K
    Last edited by buro9; 01-19-03 at 19:22.
    My homepage:
    http://www.buro9.com/
    My work:
    http://www.btopenworld.com/
    http://www.officialfootballsites.co.uk/
    http://www.jeepster.co.uk/

  2. #2
    Join Date
    Sep 2002
    Location
    Kyiv, Ukraine
    Posts
    77
    Didn't quite understand what the actual problem you encountered?

    I've worked with such data structures in MySQL and didn't have any problems with it. Please specify more detailed what's the problem that you encountered, what "limitation" do you mean?
    Yours faithfully,
    Yaroslav Zaremba

  3. #3
    Join Date
    Aug 2001
    Location
    London, UK
    Posts
    31
    OK, Suppose this is my table:

    PHP Code:
    ID  PARENT  NAME
    1   NULL    C
    :\
    2   1       Windows
    3   1       Program Files
    4   2       System32
    5   2       Drivers
    6   5       Etc 
    Now suppose given the root (C:\) I want to see my tree as this:
    PHP Code:
    C:\
      
    Progam Files
      Windows
        Drivers
          Etc
        System32 
    How would I, in as few queries as possible, create that tree?

    Should I use nested (lft, rgt), adjacent (parent), path (1,2,5) or another system to define the tree such that it can be returned in ideally one query... bearing in mind that MySql cannot do a CONNECT BY PRIOR treewalk, and that I cannot use a sub-select... this is what I refer to as the limitations of MySql... to not be able to perform a recursive query.

    Now, knowing that I cannot recursively query, and that numerous work-arounds exist by which you can represent a tree and manage the data... What should I use? What do you use? What are the pros and cons inherent in each work-around? What is the most common work-around? What would be best practice work-around in a normalised database?

    Hence, I would like a full line of inquiry into handling hierarchical queries.
    My homepage:
    http://www.buro9.com/
    My work:
    http://www.btopenworld.com/
    http://www.officialfootballsites.co.uk/
    http://www.jeepster.co.uk/

  4. #4
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    What should I use?
    What do you use?
    What are the pros and cons inherent in each work-around?
    What is the most common work-around?
    What would be best practice work-around in a normalised database?

    whoa, that's a lot of questions

    solution: self joins to a level of three or four deep
    advantage: single query
    disadvantage: user must click to explode deeper subtrees
    advantage: not exploding entire tree saves resources

    i've seen applications that use a limit of three or four or some similar number, deep enough to cover actual trees "in the wild" as it were, even if hypothetically the tree could be deeper, but if it is, there's a flag or some other visual clue (like that little plus sign in windows explorer) that shows there is a subtree that can be revealed by re-querying

    as far as work-arounds go, there really isn't an equivalent, because that would mean the entire tree comes out of the database in one call, which implies n-level recursion like oracle's, which no other database has

    so if you cannot get it all in one query, you have to use more than one query, and going 3 or 4 levels deep starting at whatever node you're sitting on is a good compromise

    calling one record per query ("query in a loop") and then building the entire tree in a script array is typically the way php programmers approach the problem, but this is, in my mind, the worst practice

    as for that celko nested set solution, i still haven't figured it out for myself, but more important, i have no idea whether you can get the entire tree fully exploded in one call


    rudy

  5. #5
    Join Date
    Sep 2002
    Location
    Kyiv, Ukraine
    Posts
    77
    r937 is right - self joins should do what you want. But I still not quite sure with the problem - what final recordset you would like to get from your example-table? Write here the desired fields and I'll try to help you but as for now I still do not see what difficulties you meet ...

    Even though MySQL doesn't support sub-selects there are usually a lot of methods to avoid them. And the last one is by using temporary tables with TYPE=HEAP.
    Yours faithfully,
    Yaroslav Zaremba

  6. #6
    Join Date
    Aug 2001
    Location
    London, UK
    Posts
    31
    Well, I'm representing several trees in the database although the core ones are:
    1) A docroot
    2) The DOM of a web page

    The docroot isn't so bad, this is mostly to assist with the construction of hierarchical navigation and 'bread crumbs' (such as the link above saying 'dBforums Database Server Software MySQL".

    This isn't so bad because the recursion is quite limited... websites don't usually descend more than 3 in depth, 4 at worst... beyond that there is a design flaw.

    What is bad however is the mapping of the DOM. The DOM (Document Object Model) is the full structure of the HTML of a page. It may seem a bit barmy to have our HTML structure in the database... but we have got a working schema in Oracle that allows us to model a full page as components and then stores it all in the db... this allows us to swiftly move everything from one side of the page to the other by re-assinging its parent.

    So, recursion on a single page may look like:
    PHP Code:
    HTML
      HEAD
        SCRIPT
      BODY
        TABLE
          TR
            TD
              TABLE
                TR
                  TD
                TR
                  TD
            TD
          TR
            TD
              TABLE
                TR
                  TD 
    Some of those are already 8 deep... and complex pages require further nesting of elements (be they tables or divs).

    We've played with a few experiments and came up with the following conclusions:

    nested (lft, rgt): Celko's idea seems easy to query, but unreasonably difficult to manage. The extra effort in maintaining the lft, rgt values and managing them is too much of an overhead and doesn't lend itself as friendly easy to maintain code.

    path (1,2,5): path seems to be the most popular performance hack there is... even vBulletin (the software this board is running on) use it liberally. The maintenance is considerably easier, but the polluted schema and still having to manage the hierarchical data seems to be short-sighted... This modelling doesn't favour offering ports whereby you can have a single good schema work efficiently in other db's.

    adjacent (parent): ok, so you can't do the recursion in the sql... but the model is still good and no maintenance is required. For the docroot, the data changes seldom, so returning everything and supplying a persistence layer for the resultset avoids having to re-query often... processing can be done very simply in OO languages, and even procedural recursion is not difficult. For the DOM model, we will add to the schema a way to identify all usages of DOM elements on one page, and again return all applicable results in one hit and leave recursive processing to the script/language that asked for it.

    Effectively I am conceding that the model is stronger (management, maintainability and portability) than the performance aspects (single hit query and no additional recursive processing by the script language). To counter the performance loss and the additional script overhead we will use a persistence layer with a method of invalidating the persisted data whenever the data changes... for the most part our application will not hit the database for any structural or layout data... that is, until that data changes (the script that inserts/updates the data will invalidate necessary dependencies).

    Thanks r937, very helpful comments... you must be able to tell I'm too used to Oracle I don't want to taint a good model with a bad hack, so I'll stick with the model and transfer the processing to the script... when we do an Oracle port the treewalking will go back in
    My homepage:
    http://www.buro9.com/
    My work:
    http://www.btopenworld.com/
    http://www.officialfootballsites.co.uk/
    http://www.jeepster.co.uk/

  7. #7
    Join Date
    Sep 2002
    Location
    Kyiv, Ukraine
    Posts
    77
    Problem appeared to be more complex than I've seen it at the first sight. Can't say anything except as making queries in a "loop" for getting the required result ....

    Also will be very interested to hear possible solutions to the problem.
    Yours faithfully,
    Yaroslav Zaremba

  8. #8
    Join Date
    Mar 2009
    Posts
    1
    Could you use something like
    Code:
    select dirtree(root) from dual;
    , where dirtree is a recursive function taking a unique directory identifier and returning the record set containing all descendant nodes?

  9. #9
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    l0b0, the answer is no

    thanks for digging up a six-year-old thread
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •