Results 1 to 3 of 3
  1. #1
    Join Date
    Mar 2008
    Posts
    11

    Never modifying data: Are pointers a good solution?

    I run a website that stores a lot of user-generated data. Currently, I allow them to modify records. Say someone rates a product as 3 out of 5 stars today. Tomorrow, they can change that to 5/5. When they do so, the record is simply updated.

    I'm thinking that I'd rather have all of the data - to know the data's history - and I'm considering different ways of doing this. One way that I'm considering is to use a "pointer" (for lack of a better word). I'd insert the new rating into the table, snag its ID, and then update its parent to add the child's ID into its previously null "pointer" field. An example of one item that has had its rating modified is below:

    Code:
    ID | Product | Rating | childID
    1  |  Flowers  |  3  |  2
    2  |  Flowers  |  5  |  NULL
    The reason I like this approach is that the freshest data can be isolated by searching for all records where childID is null. However, are there other general approaches to data persistence that might be more efficient? Any feedback would be greatly appreciated.

  2. #2
    Join Date
    Feb 2009
    Posts
    8
    Quote Originally Posted by nineinchnailgun
    I run a website that stores a lot of user-generated data. Currently, I allow them to modify records. Say someone rates a product as 3 out of 5 stars today. Tomorrow, they can change that to 5/5. When they do so, the record is simply updated.

    I'm thinking that I'd rather have all of the data - to know the data's history - and I'm considering different ways of doing this. One way that I'm considering is to use a "pointer" (for lack of a better word). I'd insert the new rating into the table, snag its ID, and then update its parent to add the child's ID into its previously null "pointer" field. An example of one item that has had its rating modified is below:

    Code:
    ID | Product | Rating | childID
    1  |  Flowers  |  3  |  2
    2  |  Flowers  |  5  |  NULL
    The reason I like this approach is that the freshest data can be isolated by searching for all records where childID is null. However, are there other general approaches to data persistence that might be more efficient? Any feedback would be greatly appreciated.
    Hey,
    this is the same as the solution that that I came up with for my project. What you do is use a reflexive relationship to track the previous values of rating, i.e. a kind of a singly-linked list. Null indicates the latest rating.
    I am going to use it to keep track of the order in which tasks need to be performed.
    I am not an sql guru like most of the guys here, so I couldn't give an advice based on my experience, but the following article is by a bloke who seems to know what he is doing.

    The only problem I think is that when done this way, the table is no longer in 3NF because all non-prime attributes will depend on on Id as well as ChildId, which violates the "nothing but the key" rule. In my design I have:
    Code:
    table 1:
    ID | Product | Rating |
    1  |  Flowers  |  3  |
    2  |  Flowers  |  5  |
    
    table 2:
    ID{pk, fk->table1.ID} | ChildId{fk->table1.ID)
    Best

  3. #3
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Quote Originally Posted by nineinchnailgun
    Any feedback would be greatly appreciated.
    A few points:
    • You're missing a user_id to show who the rating belongs to.
    • Navigating the childIds would be a pain. Imagine writing some code to extract all the rating history for a product by a user using this childId field. I personally wouldn't use linked lists at all. Instead ...
    • You are also missing a time field so at the moment you can't tell when user changed rating. You could then use the latest date to find the current record and the order of all the records before it.
    • Using null to mark the latest field works in some databases and not in others ie the optimiser might look at your query and say that, seeing as you don't know the value for the field, then there's no point using an index based on that field. If this is the case with your RDBMS then just use a set value ie "1st jan 3000".
    Mike

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •