Results 1 to 10 of 10
  1. #1
    Join Date
    Mar 2004
    Posts
    370

    Question Unanswered: Implementing a version control mechanism

    Hi everybody,

    I need to implement a versioning mechanism in my OLTP system to enable holding all history of many types of contents (Docs,Refers,Attachments,...).

    I conclude that there are 3 main ways as below(meybe more you can tell me):

    1. As our legacy system: Insert a complete copy of modified content in a seperate table together with some timestamp info. This way any simple update in content leads to an additional insert too.

    2. Use some diff algorithm to produce undo steps: b/c many of our content are text we could do this easily. This looks like to Oracle's redo logs, IMO.

    3.Using DBMS specific logs: (Note: I am not sure this is a feasible solution!) I do not save any history data. I will use something like oracle's log miner or SQL Server's transaction log facilities (maybe there is some native API to deal with too) and will generate required history when necessary.

    The most important thing is that our customers need to search and generate report on these content history with a reasonable performance.

    What is the best solution in your opinion? Your comments are appreciated, friends.

    -Thanks in advance

  2. #2
    Join Date
    Dec 2003
    Location
    Oklahoma, USA
    Posts
    354
    It seems to me that your 2nd and 3rd ideas are the same, using LogMiner to rebuild transactions and retrieve data... so you really have two solutions.

    From what I have read (meaning, I've not tried this) I understand that using LogMiner will be rather slow as Oracle has to search through archive logs to rebuild your data to the point in time you desire. As you stated that is something your customers will be using, I wouldn't recommend a solution that is slow... unless you like getting complaint calls, etc.

    I would propose two solutions:

    1) Write some triggers for tables that contain data you want to keep versions of previous data. An update trigger should do the trick, just create a copy of the previous data somewhere else before saving the new data. This would be a good "long-term" solution as it would keep historic data as long as you need it. Then, when the customer wanted a historical report, just query the history tables.

    2) If there is not a need to keep data copies forever-and-ever, you can use a feature (assuming Oracle 9i+) called flashback query. Flashback query allows you to requery your tables based on a point-in-time. However, the amount of time you are able to "go back in time" depends on disk space... Reports based on flashback essentially are SELECT * FROM table AS OF TIMESTAMP...

    I've used flashback query before to retrieve data that was accidentally erased, but I think it could be used in your situation as noted above.

    Hope this gives you some ideas.
    JoeB
    save disk space, use smaller fonts

  3. #3
    Join Date
    Aug 2004
    Location
    France
    Posts
    754
    Quote Originally Posted by joebednarz
    However, the amount of time you are able to "go back in time" depends on disk space...
    Right, Flashback queries use the UNDO tablespace to read old data, so once the desired state is no more in the UNDO, you will get a ORA-1555 SNAPSHOT TOO OLD error.

    Quote Originally Posted by joebednarz
    Reports based on flashback essentially are SELECT * FROM table AS OF TIMESTAMP...
    Or SELECT * FROM table AS OF SCN ... I've just used this feature and I have been quite disappointed by the "AS OF TIMESTAMP" use because the timestamp "state" is only precise to the second, meaning that if you have 5 modifications on the same row within the same second, you won't be able to access the five different states with a "AS OF TIMESTAMP" flashback query : you will need a "AS OF SCN" (that you could have gotten thanks to the ORA_ROWSCN pseudo-column for example).

    HTH & Regards,

    rbaraer
    ORA-000TK : No bind variable detected... Shared Pool Alert code 5 - Nuclear query ready .

  4. #4
    Join Date
    Mar 2004
    Posts
    370
    Thanks for your help friends,
    I've used Flashback queries before but just for administration pupose, not as a mechanism inside our applications.
    Unfortunately this feature can't be applied because our application will run on Oracle, SQL Server and MySQL and AFAIK there is no equivalent feature on latter ones.
    My first solution and a variant that joebednarz has mentioned is OK but it will cause a lot of work for system: insert a whole copy of data into another table and keep it safe, doesn't appear so scalable for me (note that some of these contents are i.e. 1 GB in size).
    Anyway thanks for you help and time

  5. #5
    Join Date
    Aug 2004
    Location
    France
    Posts
    754
    We have the same problematic too, but with less volume...

    For some data we want to have a copy of it just after a user modification, and keep this copy for auditing purposes... This concerns only 4 tables and a dozen of actions, so there is an applicative insert for each of these actions within the corresponding stored procedure : that is we update the record and then copy it to the historical tables. No trigger because this is not systematic and because I hate triggers .

    Now concerning Flashback queries, we could not have used them for the above, but we are using them for optimistic locking between users :

    - A user inserts a new row or selects an existing row and is sent its ORA_ROWSCN (the SCN corresponding to the last update on that row), or more precisely GREATEST(ORA_ROWSCN, TIMESTAMP_TO_SCN(SYSTIMESTAMP)) so that if the last update's SCN is no more in the UNDO, the following will work...
    - When the user wants to modify this row, he sends the SCN that was sent to him before and the stored procedure does the following : selects the current state of this row and compare it to its state "AS OF" the SCN sent by the client (that is why we don't want this SCN to be too old). If they are the same, then the user had up-to-date data on the screen concerning this row, so the update is performed and the new ORA_ROWSCN for this row is returned. If however the data are different, an error is sent back, asking him to re-select this row.

    This works beautifully, after some "tuning". At the beginning I was using "AS OF TIMESTAMP" everywhere but as I said in my prior post, it was not precise enough... So I used ORA_ROWSCN and "AS OF SCN"... Then realized that if the state corresponding to ORA_ROWSCN was no more in the UNDO, every update attempt would fail telling the user he was not up-to-date, hence the GREATEST(ORA_ROWSCN, TIMESTAMP_TO_SCN(SYSTIMESTAMP)) in the Select, so that if the row has just been updated within the current second, ORA_ROWSCN is returned, else TIMESTAMP_TO_SCN(SYSTIMESTAMP) is returned : anyway, what is sent back is the SCN as of which the client will be up-to-date.

    Thanks to this scenario, we could use optimistic locking without having to add a timestamp column in each table (that we would have had to update each time) or having the client send all the data he has on the screen before each update so as to compare to what is in the db.

    I LIKE this .

    HTH & Regards,

    rbaraer
    ORA-000TK : No bind variable detected... Shared Pool Alert code 5 - Nuclear query ready .

  6. #6
    Join Date
    Mar 2004
    Posts
    370
    Also I found there is a built in mechanism in Oracle called "workspace management" that is exactly for this purpose. Unfortunately we could not use it b/c of portability problems I mentioned before.
    If you are interested search asktom for this keywords.

  7. #7
    Join Date
    May 2004
    Location
    Dominican Republic
    Posts
    721
    Quote Originally Posted by RBARAER
    ..
    Or SELECT * FROM table AS OF SCN ... I've just used this feature and I have been quite disappointed by the "AS OF TIMESTAMP" use because the timestamp "state" is only precise to the second, meaning that if you have 5 modifications on the same row within the same second, you won't be able to access the five different states with a "AS OF TIMESTAMP" flashback query : you will need a "AS OF SCN" (that you could have gotten thanks to the ORA_ROWSCN pseudo-column for example).
    ....
    Well, on 10g, you have an added VERSIONS clause that extend the TIMESTAMP use, that you can use to find the "versions" of the rows you like on a given time, for example:
    Code:
    SQL>
    SQL> create table t as
      2  select 1 x from dual;
    
    Table created.
    
    SQL> create or replace procedure foo is
      2  begin
      3     update t
      4        set x = x * 5;
      5     commit;
      6  end;
      7  /
    
    Procedure created.
    
    SQL> declare
      2     n       int;
      3  begin
      4     dbms_job.submit( n, 'foo;' );
      5     dbms_job.submit( n, 'foo;' );
      6     dbms_job.submit( n, 'foo;' );
      7     commit;
      8  end;
      9  /
    
    PL/SQL procedure successfully completed.
    
    SQL> select * from t;
    
             X
    ----------
           125
    
    SQL> select VERSIONS_STARTTIME, VERSIONS_ENDTIME, VERSIONS_OPERATION, x
      2    from t VERSIONS BETWEEN TIMESTAMP MINVALUE AND MAXVALUE
      3   order by VERSIONS_STARTTIME
      4  /
    
    VERSIONS_STARTTIME             VERSIONS_ENDTIME               V          X
    ------------------------------ ------------------------------ - ----------
    17-OCT-06 09.20.32 AM                                         U        125
    17-OCT-06 09.20.32 AM          17-OCT-06 09.20.32 AM          U         25
    17-OCT-06 09.20.32 AM          17-OCT-06 09.20.32 AM          U          5
                                   17-OCT-06 09.20.32 AM                     1
    
    SQL>

  8. #8
    Join Date
    Aug 2004
    Location
    France
    Posts
    754
    Quote Originally Posted by JMartinez
    Well, on 10g, you have an added VERSIONS clause that extend the TIMESTAMP use, that you can use to find the "versions" of the rows you like on a given time
    Yes, but yet, this is not sufficient for our purpose :

    - suppose in second S there were 5 transactions T1, T2, T3, T4 and T5
    - you will see them all thanks to the VERSIONS clause, that is right
    - now suppose you just finished T3 and want to catch the current version so as to go back to that exact version later... from the example you give, if I understand correctly one would have to take the version with the highest end time just after T3... but the example below shows that this is not trustworthy

    Code:
    rbaraer@Ora10g> create table t as
    select 1 x from dual;  2
    
    Table created.
    
    rbaraer@Ora10g> create or replace procedure foo is
    begin
       update t
          set x = x * 5;
       commit;
    end;
    /  2    3    4    5    6    7
    
    Procedure created.
    
    rbaraer@Ora10g>
    declare
       n int;
    begin
       foo();
       foo();
       foo();
    end;
    /rbaraer@Ora10g>   2    3    4    5    6    7    8
    
    PL/SQL procedure successfully completed.
    
    rbaraer@Ora10g> alter session set nls_timestamp_format='YYYY/MM/DD HH24:MI:SS.FF9';
    
    Session altered.
    
    rbaraer@Ora10g>
    select VERSIONS_STARTTIME, VERSIONS_ENDTIME, VERSIONS_OPERATION, x
    from t VERSIONS BETWEEN TIMESTAMP MINVALUE AND MAXVALUE
    order by VERSIONS_STARTTIME;rbaraer@Ora10g>   2    3
    
    VERSIONS_STARTTIME             VERSIONS_ENDTIME               V          X
    ------------------------------ ------------------------------ - ----------
    2006/10/18 07:15:19.000000000                                 U        125
    2006/10/18 07:15:19.000000000  2006/10/18 07:15:19.000000000  U         25
                                   2006/10/18 07:15:19.000000000             5
    
    rbaraer@Ora10g>
    Here you could think it might work, but notice 2 things :
    - the fractional second part is 000000000 for every rows ! How would you order them correctly ?
    - they seem to be ordered, but if they were actually ordered by their transaction time, the order would be opposite to this one : here we would have a DESC "order by", whereas the default is ASC

    Now let's ORDER BY VERSIONS_ENDTIME :
    Code:
    rbaraer@Ora10g> select VERSIONS_STARTTIME, VERSIONS_ENDTIME, VERSIONS_OPERATION, x
    from t VERSIONS BETWEEN TIMESTAMP MINVALUE AND MAXVALUE
    order by VERSIONS_ENDTIME;  2    3
    
    VERSIONS_STARTTIME             VERSIONS_ENDTIME               V          X
    ------------------------------ ------------------------------ - ----------
    2006/10/18 07:15:19.000000000  2006/10/18 07:15:19.000000000  U         25
                                   2006/10/18 07:15:19.000000000             5
    2006/10/18 07:15:19.000000000                                 U        125
    
    rbaraer@Ora10g>
    Here you can see that obviously, one information is missing in order to get the last transaction.

    You did not convince me : for me SCN is the only way if you can have more than one transaction in a second (which is quite frequent, isn't it ? ).

    Regards,

    rbaraer

    PS : I realised there is a little flaw in the mechanism I exposed in my previous post : the "so the update is performed and the new ORA_ROWSCN for this row is returned" part is not completely bulletproof because once the COMMIT is performed, the SELECT FOR UPDATE lock is released, allowing another transaction that was waiting to commit to do so. So when I select the ORA_ROWSCN I am not sure it is the one just after "my" transaction or after mine and the other one... So what you told me gave me an idea : while I still only trust SCN, I could use VERSIONS_STARTSCN and VERSIONS_ENDSCN to be sure that the SCN sent back to the client is really the one just after HIS transaction. I would SELECT FOR UPDATE the row to be updated, get the current SCN c_scn, update, commit, and then SELECT the first VERSIONS_ENDSCN greater than c_scn for that row, which I would send back to the client. This would REALLY work beautifully .
    ORA-000TK : No bind variable detected... Shared Pool Alert code 5 - Nuclear query ready .

  9. #9
    Join Date
    May 2004
    Location
    Dominican Republic
    Posts
    721
    Not that I intented the example to give a full, 100% working test case, I was just exposing the idea. Of course if you really really want transactions details to the granule, you would order them by VERSIONS_ENDTIME and then by VERSIONS_STARTSCN/ENDSCN.

    I believe what I said is still held true: that you can capture different transactions that happened on the same time.

  10. #10
    Join Date
    Aug 2004
    Location
    France
    Posts
    754
    Quote Originally Posted by JMartinez
    I believe what I said is still held true: that you can capture different transactions that happened on the same time.
    Yes you are right. It's just that you quoted me when I was saying I had been disappointed by AS OF TIMESTAMP, so I thought that you wanted to say that with VERSIONS_STARTTIME and VERSIONS_ENDTIME the problem could be avoided, which is not true.

    So we agree .

    Best regards,

    rbaraer
    ORA-000TK : No bind variable detected... Shared Pool Alert code 5 - Nuclear query ready .

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •