Results 1 to 13 of 13
  1. #1
    Join Date
    Jul 2009
    Posts
    168

    Data consistency issue

    Hi all,

    I am currently in the design phase of a project and concerned about an issue with data consistency of my database over time. I have a number of bus coaches running on a number of assigned routes. My design for the routes table, at first guess, would be something like :

    Route
    ------

    id name description date price
    1 aRoute a-b-c 1/1/13 100
    2 bRoute a-d-e 1/1/13 200

    These routes id would then be assigned to bus coaches table to know which route they undertook. Now, suppose in the middle of the year, someone amends the Route table's description so that aRoute is now 'a-e-f' instead of 'a-b-c' and if I want to track the history of routes done by coaches, how do i do that because then the data will be corrupted if I refer to the routeID. Do I need another table to keep track of everything? Any advice on this please? Thanks

  2. #2
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    The logically simplest way to handle this is to add an effective_date to each row. The downside to this approach is that you have to do a bit of trickery every time you need to get the appropriate row for a given point in time. It is easy to build a view that will return the current row based on the system time, but that isn't always what you need.

    Even if you create an "archive" table, it doesn't really help much. The current row is easy to find in the main table (just like in the view I mentioned above), but you still need to find the appropriate historical row and then you have to deal with multiple tables to find it!

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  3. #3
    Join Date
    Jul 2009
    Posts
    168
    Thanks for reply but I admit it is still a puzzling thing to me on how to implement the design. Anyone can give some examples because i would guess that is not a new thing for systems that need to keep track of previous records so that reports can be issued later.

  4. #4
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    I'm going to give you my best guess at what your columns contain. Please correct any errors, then we can proceed.

    id is an identifier for the route. A unique number that you assign as you see fit.

    name is how the driver or the rider would identify a given route. It is presumed to be unique, but that may not alwasy be true.

    description is a technical description of the route, useful to route planners/operators but probably not to drivers or riders.

    date is what you do with your wife on a Saturday night.

    price is how much the date costs you.

    Please correct any of my definitions that are in error, then we can proceed.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  5. #5
    Join Date
    Jul 2009
    Posts
    168
    This is very correct and particularly liked the 'date' meaning

    ID: a primary field
    Name: name identifier for route

    Description: description of the route, also important to both operators and drivers when it comes to pay fares and issuing receipts to drivers etc

    The rest is right. Thanks. I was reading on the stuff and I came across :

    Using Triggers to Track Database Action History - 4GuysFromRolla.com

    This was what initially I had in mind that history tables need to be set up.

  6. #6
    Join Date
    Jul 2009
    Posts
    168
    My main concern in tracking history of things is that say at the end of the year we need a report for each driver and hence we would need the routes undertaken. However, if the application user modified the route description from say 'london-croydon-swindon' to 'london-croydon-brighton', then the report would be incorrect. As the link above suggests, we would need to create history tables for each table but I would like to know how to implement the best design normally done for it.

  7. #7
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    First off, if you need to report to a regulatory agency then a whole different set of rules comes into play. When dealing with regulatory compliance, there is no "wiggle room" and the data needs to be auditable. In this case, you need a history table that records every trip for every driver and possibly another table to record every trip for every vehicle. These tables need to record every detail about every trip that is required for regulatory compliance. If you choose to ignore this advice, you will certainly come to rue the day you made that choice!

    Adding history tables just complicates the issue of dealing with this kind of information in exchange for making exactly one type of query perform better. While this is an OLTP database instead of a DW, you can search the web for the term "Slowly changing dimension" to find many discussions of how this can be done effectively and efficiently.

    Let me know if you need an example. I can create one relatively easily if necessary.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  8. #8
    Join Date
    Jul 2009
    Posts
    168
    Thanks Pat for the reply and for the advice. I will make sure we have those history tables implemented. An example would be most welcome of what you mean by "Slowly changing dimension". Thanks.

  9. #9
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    This is an extreme example, but note only one table (no separate history table) and that by changing the date constant inside the SELECT statement you can get the appropriate rows for that date.
    Code:
    CREATE TABLE kpeeroo (
       route_id         INT
    ,  route_name       VARCHAR(30)
    ,  description      VARCHAR(30)
    ,  effective_date   DATE
    ,  price            INT
       )
    
    INSERT INTO kpeeroo (route_id, route_name, description, effective_date, price)
       VALUES
          (1, 'aRoute',    'a-b-c', '2012-01-01', 100)
    ,     (1, 'aRoute',    'a-b-q', '2012-03-01', 110)
    ,     (1, 'aRoute 01', 'a-b-q', '2012-05-01', 120)
    ,     (1, 'aRoute 01', 'a-b-t', '2012-07-01', 130)
    ,     (1, 'aRoute 02', 'a-b-q', '2012-09-01', 140)
    ,     (1, 'aRoute',    'a-b-c', '2012-11-01', 150)
    ,     (2, 'bRoute',    'a-d-e', '2012-01-01', 200)
    ,     (2, 'bRouteZ',   'a-d-f', '2012-02-01', 205)
    ,     (2, 'bRoute',    'a-d-e', '2012-04-01', 210)
    ,     (2, 'bRouteF',   'a-d-g', '2012-06-01', 260)
    ,     (2, 'bRouteG',   'a-d-e', '2012-08-01', 270)
    ,     (2, 'bRoute',    'a-d-e', '2012-10-01', 280)
    
    
    SELECT *
       FROM kpeeroo AS a
       WHERE  a.effective_date = (SELECT Max(z.effective_date)
          FROM kpeeroo AS z
    	  WHERE  z.route_id = a.route_id
    	     AND z.effective_date < '2012-07-15')
    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  10. #10
    Join Date
    Nov 2003
    Posts
    2,935
    Quote Originally Posted by Pat Phelan View Post
    Adding history tables just complicates the issue of dealing with this kind of information in exchange for making exactly one type of query perform better.
    that depends on the DBMS being used.

    In Oracle 11 you can create a "FLASHBACK ARCHIVE" which gives you the ability query historical data without hassle and without any read overhead on the base data. Of course it comes with a hefty price tag, but then developing such a solution isn't free either.
    I will not read nor answer questions where the SQL code is messy and not formatted properly using [code] tags: http://www.dbforums.com/misc.php?do=bbcode#code

    Tips for good questions:

    http://tkyte.blogspot.de/2005/06/how...questions.html
    http://wiki.postgresql.org/wiki/SlowQueryQuestions
    http://catb.org/esr/faqs/smart-questions.html

  11. #11
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    shammat, maybe I didn't make myself clear. Adding history tables or a flashback archive (or in this case many archives) only helps one specific query perform better. Having history in another table helps the case of "what is the current value", but actually seriously hurts "what is the value as of date X".

    The Oracle flashback archive is a neat feature, and would solve this problem with a minimum of code. The performance hit for using flashback archives seems to be pretty high, and they are very advanced technology for a newcomer to the database world. Oracle is probably overkill for this particular application, and at least from my perspective adding flashback is just "guilding the lilly" of overkill.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  12. #12
    Join Date
    Jul 2009
    Posts
    168
    Thanks for reply Pat. So in the example you gave, we wont have a unique primary key field auto-generated by database server? Also as you mentioned, no need for history tables which is cool but i will need to link up the effective_date field to the vehicle or driver table. Is the effective date then a sort of primary field in this case?

  13. #13
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Quote Originally Posted by shammat View Post

    In Oracle 11 you can create a "FLASHBACK ARCHIVE" which gives you the ability query historical data without hassle and without any read overhead on the base data. Of course it comes with a hefty price tag
    In DB2 there's a feature called temporal tables that makes maintenance of historical data quite simple and pretty much transparent to the application. This feature is included in DB2 Express-C, which is free to use.

    db2 time travel queries - Google Search
    ---
    "It does not work" is not a valid problem statement.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •