Page 1 of 3 123 LastLast
Results 1 to 15 of 39
  1. #1
    Join Date
    Sep 2004
    Posts
    4

    Unanswered: Do I need recursion ?

    Hi there, Any tips on my problem would be most welcome...

    right, the scenario.

    A web Blog

    blogger1 posts a blog_entry, e.g I love the simpsons
    blogger2 comments on that blog_entry, e.g No, I hate the simpsons
    Blogger3 comments on that comment, ie, How can you hate the simpsons.

    so you can comment on a comment on a comment etc.. lool.

    right, i have got two tables... Blog_entry & comment. i need to be able to search for a blog_entry + all the comments on that blog_entry..

    at the moment i can search for the comments on the blog_entry using the FK in the comment table.


    blog_entry
    INSERT INTO Blog_Entry VALUES(0001,'I love the Simpsons');

    comments
    INSERT INTO Comment VALUES(0001,' No I hate the simpsons, cID 1000);

    but i need to be able to search for the comments on comments

    INSERT INTO Comment VALUES(0001,' How can you hate the simpsons, cID1000);

    hopefully you can see the problem here, with only one comments table how can i get the search for the comment on the comment.. there’s nothing linking them...
    I could make a sub comments table, and use a FK (as with the blog_entry & first comment)

    but then I would have to make another sub sub table to be able to get those comments on the first sub table... this would go on and on for each comment on comment.

    you can see the cID1000 (comment PK) I can't use this to get the comment because its duplicating the PK...

    So, I need to be able to search for the comments on comments… eg. I need to be able to search for blogger2, and any comments that were made on his comments.


    Someone I know mention using recursion to get the comment on comment info, is this right ?

    Hehe, I hope you understand what im asking here… the is my first exploration of SQL, so any tips, hints, would be most welcome….

    Thanks loads…

    PS: if there anything you don’t understand about what I have written, or what im asking… please say so…

    Spence.

  2. #2
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    you could actually do it with just one table

    when you add a comment, add also the id of the original entry

    thus when you add a comment on a comment, the original entry id is available, so you can add it to the comment on the comment too

    then to retrieve all comments, and comments on comments, for a specific entry, it's just a simple WHERE originalentry=5

    so recursion is not required
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  3. #3
    Join Date
    Sep 2004
    Posts
    4
    Quote Originally Posted by r937
    you could actually do it with just one table
    when you add a comment, add also the id of the original entry
    thus when you add a comment on a comment, the original entry id is available, so you can add it to the comment on the comment too
    then to retrieve all comments, and comments on comments, for a specific entry, it's just a simple WHERE originalentry=5
    so recursion is not required
    Hi rudy, thank you very much for your input here.

    using one table, I take it you mean just use the Blog_entry tbl, as the attributes in the blog_entry & comment tbls are the same.. the only difference is the actual text...

    but, If i were to add to original entryId in the new comment tuple (But in the same table) it would break the PK constraint, there would be duplicate blog_entryID's.. So i need to keep both the comments and the Blog_entry tables seperate..

    To able to make a comment on a comment, dont i need something to identify each individual comment ? if so, then im back to square one...

    sheesh, this is confusing....
    Last edited by Spencer_k; 01-08-05 at 14:02.

  4. #4
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    yikes! (can't read it, not sure i need to, though)

    yes you can do it in one table, but you don't have to
    Code:
    create table entries
    ( id integer not null primary key auto_increment
    , name varchar(255) not null
    , unique index entriespk (name)
    , thread integer null 
    , foreign key threadstarter (thread) references entries (id)
    , replyto integer null 
    , foreign key replytothread (replyto) references entries (id)
    , entry text 
    );
    entries that start a new thread have both thread and replyto NULL

    when you reply to an entry, you link to it via replyto, and via thread to the id of the original entry which started the thread

    (so replies to the original new entry have the same id in thread and replyto)
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  5. #5
    Join Date
    Sep 2004
    Posts
    4
    Quote Originally Posted by r937
    yikes! (can't read it, not sure i need to, though)

    yes you can do it in one table, but you don't have to
    Code:
    create table entries
    ( id integer not null primary key auto_increment
    , name varchar(255) not null
    , unique index entriespk (name)
    , thread integer null 
    , foreign key threadstarter (thread) references entries (id)
    , replyto integer null 
    , foreign key replytothread (replyto) references entries (id)
    , entry text 
    );
    entries that start a new thread have both thread and replyto NULL

    when you reply to an entry, you link to it via replyto, and via thread to the id of the original entry which started the thread

    (so replies to the original new entry have the same id in thread and replyto)
    LoooL rudy,

    Cheers for the help m8..

    Sweet stuff!

    Have a good day.

  6. #6
    Join Date
    Aug 2003
    Location
    Delft, The Netherlands (EU)
    Posts
    447

    Lightbulb

    The problem is an hierarchical one, each comment has exactly one parent (except the original thread), and 0 to n childs. The standard approach to store such facts is a recursive one, as shown by Rudy. Such a structure, however, isn't to query using standard SQL.

    Usual solutions for this are
    * recursive T-SQL stored procedure
    * bridging

    Acutally, the solution presented here, is a bridge solution, however, just a bridge to the original thread. To be able to sort on (sub)threads and to format the solution (e.g. by using indents) youwill need to have the complete path. This is, of cource, redundant, but not a problem because such a path will not change,so there is no danger for inconsistency.

    If you can estimate the maximum deepth of a discussion, you can make a structure acoordingly. If you can't, you may also consider to store your information without the path, but dynamically create a temp table with your actual path length and put the data in it for presentation.

    I hope I made myself sufficiently clear.
    Make everything as simple as possible, but not simpler! - A. Einstein
    DB Problems? DB Explorer, BTrieve Re-engineering, DB Conversions & ETL? Conversion Tool

  7. #7
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    the solution presented here did not (yet) include queries, so i don't understand why it's a "bridge" solution -- it is a classic hierarchical solution capable of supporting true recursion

    further, one does not need to store the path (and i have not allowed for it), so the redundancy only comes into the design if you want it to come into the design, and i don't

    further, i am not going to argue whether the thread column (which points from every comment in a thread, no matter at what level, to the thread starter) is redundant, because it isn't

    but i agree whole-heartedly about knowing the maximum depth -- this allows you to write a query consisting of N self-joins, which means that no matter where you are in the hierarchy, you can get all nodes below the node you're on, with indentation, in one single query, without a temp table

    recursion and/or temp tables are only required if you do not know the maximum depth in the entire hierarchy
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  8. #8
    Join Date
    Aug 2003
    Location
    Delft, The Netherlands (EU)
    Posts
    447
    but i agree whole-heartedly about knowing the maximum depth -- this allows you to write a query consisting of N self-joins, which means that no matter where you are in the hierarchy, you can get all nodes below the node you're on, with indentation, in one single query, without a temp table

    recursion and/or temp tables are only required if you do not know the maximum depth in the entire hierarchy
    I agree on this as a third option.

    further, i am not going to argue whether the thread column (which points from every comment in a thread, no matter at what level, to the thread starter) is redundant, because it isn't
    Why not? You can get it by recursively selecting the parent of a comment, or not? However, redundancy isn't the problem here, I would even add the entire path by having 1st level comment, 2nd level comment, 3rd le.... etc. The problem is the presentation of hierarchical data.
    Make everything as simple as possible, but not simpler! - A. Einstein
    DB Problems? DB Explorer, BTrieve Re-engineering, DB Conversions & ETL? Conversion Tool

  9. #9
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    just because you can get the thread starter recursively does not make it redundant

    also, you can achieve indentation without storing the path, by using the N-level self-join
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  10. #10
    Join Date
    Aug 2003
    Location
    Delft, The Netherlands (EU)
    Posts
    447
    Quote Originally Posted by r937
    just because you can get the thread starter recursively does not make it redundant
    Then, with all respect, you have your own, wrong definiton of redundancy, see Definition Redundancy

    But, again, I don't have a problem with redundancy here, since a discussion branche will not be moved, probably. My point is, however, to make either the whole path redundant, or nothing and solve it in this case with other means (N views if you know the maximum deepth, or with a recursive T-SQL procedure). Just to have a reference to the root thread will not help at all.

    Please note, that my proposal to store the whole path, is something similiar to your N-View approach with the difference, that the complexitity of your execution plan multiplies with every additional level, while the complexity of my execution plan is almost liniar growing.
    Make everything as simple as possible, but not simpler! - A. Einstein
    DB Problems? DB Explorer, BTrieve Re-engineering, DB Conversions & ETL? Conversion Tool

  11. #11
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    heh, that's pretty amusing, citing a dictionary as an authoritative reference for a relational database concept

    the reference to the root thread does so help, if you wanted to quickly list all the ids and titles within a thread without regard to indentation

    the complexity of my self-join-N-times does not grow at all, unless you change N, which is assumed to be the maximum depth, and if you change the maximum depth, then it's not a maximum, is it

    i do realize that storing the path has certain benefits, i just do not happen to value them over my method

    for more on paths, read this thread

    in fact, one of the participants in that thread has a tutorial on his method here:

    Multi-Threaded Message Board Solution (Hierarchical Data without the Adjacency Model)
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  12. #12
    Join Date
    Aug 2003
    Location
    Delft, The Netherlands (EU)
    Posts
    447
    Quote Originally Posted by r937
    heh, that's pretty amusing, citing a dictionary as an authoritative reference for a relational database concept
    I hope that - beside the entainment - you got my point: if the same fact (root of a comment) is stored twice (both as [Thread] and [ReplyTo]), it is redundant!
    Quote Originally Posted by r937
    the reference to the root thread does so help, if you wanted to quickly list all the ids and titles within a thread without regard to indentation
    ... and structure. You can, of cource, sort by time of ID to list all comments in chronological order, but you can't help the user to detect the mutual relationships, except by printing IDs and let the user puzzle by himself.
    Quote Originally Posted by r937
    the complexity of my self-join-N-times does not grow at all, unless you change N, which is assumed to be the maximum depth, and if you change the maximum depth, then it's not a maximum, is it
    Please see my previous comment. My point was that the complexity of your preferred solution depends on N.
    Quote Originally Posted by r937
    i do realize that storing the path has certain benefits, i just do not happen to value them over my method
    Okay, we don't need to agree in our recommendations, as long as the poster gets a picture of the possiblities and can choose the one fitting to his needs.
    Quote Originally Posted by r937
    for more on paths, read this thread

    in fact, one of the participants in that thread has a tutorial on his method here:

    Multi-Threaded Message Board Solution (Hierarchical Data without the Adjacency Model)
    I took a look, and storing the comment level is a manner to get at least the indentation. To present the thread with its comment-on-comment rel;ationships, however, this method still needs recursion, while my proposal of storing the whole path is non-recursive.

    I guess we made our solutions sufficiently clear for the poster; I'm curious of Spencers reaction.
    Make everything as simple as possible, but not simpler! - A. Einstein
    DB Problems? DB Explorer, BTrieve Re-engineering, DB Conversions & ETL? Conversion Tool

  13. #13
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by DoktorBlue
    if the same fact (root of a comment) is stored twice (both as [Thread] and [ReplyTo]), it is redundant!
    Thread and ReplyTo aren't the same fact!!

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  14. #14
    Join Date
    Aug 2003
    Location
    Delft, The Netherlands (EU)
    Posts
    447
    Either you or me are making a considerable thinking mistake. Please see the following example, just getting three fields: ID, Thread, ReplyTo

    1,null,null
    2,1,1
    3,1,2

    In the second row, Thread=ReplyTo
    In the third row: Thread = ReplyTo of ReplyTo

    In both cases, Thread can be derived from ReplyTo, so it is stored twice, which is called redundant storage.
    Make everything as simple as possible, but not simpler! - A. Einstein
    DB Problems? DB Explorer, BTrieve Re-engineering, DB Conversions & ETL? Conversion Tool

  15. #15
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    please apply 1NF, 2NF, 3NF rules to this example: (ID, Thread, ReplyTo)

    i'd be interested to know when it violates an actual normalization rule, not a doktorblue "it looks redundant to me" normalization rule
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •