Results 1 to 7 of 7
  1. #1
    Join Date
    Nov 2004
    Posts
    7

    Unanswered: Comparison of tables

    Please let me know how can i compare two tables with same data in Oracle which has huge volume of data.
    My requirement is like this:
    I have two environment ..1)Oracle Source ---informatica ETL---Oracle target and 2) Oracle Source ---Datastage ETL---Oracle target. (OS is UNIX Solaris for both)...
    Once the data get generated into Oracle, i need to compare the table data to find out whether the data gerneated using the ETL tools are in Sync or not. The table will have huge volume of data.
    I tried with some DBcomparison tools, but it takes very long time for comparison.

    One thing i can do is, i can write the table data into a flat file and do the data comparison.
    The problem in this approach is, 1) the order of record may be different and so i need to the orderby based on index columns in the table itself before writing into flat file. i.e I need to make sure that the record order should be the same in both the flat files.
    2) I need to skip the columns that holds the time data for the comparison, otherwise all the records will show as mismatch.

    Please suggest me the ways to implement this or any new way (sql procedure/shell scripting/or any other method) to solve this issue.

    All your help is really appreicated.

    Thanks in Advance.

  2. #2
    Join Date
    Jun 2003
    Location
    West Palm Beach, FL
    Posts
    2,713

    Cool

    Use MINUS:
    Code:
    select col1, col2, col7, col9 from tableA
    MINUS
    select col1, col2, col7, col9 from tableB
    The person who says it can't be done should not interrupt the person doing it. -- Chinese proverb

  3. #3
    Join Date
    Aug 2003
    Location
    Where the Surf Meets the Turf @Del Mar, CA
    Posts
    7,776
    Provided Answers: 1
    It depends.
    Too broad a question with too few details provided.
    How often does this need to be done?
    How fast does the compare process need to be?
    How likely will differences occur?
    Do you only need to know a difference exists or do you need to know WHAT is different?
    What is supposed to happen when a difference is detected?
    How will you know which is "correct"?
    You can lead some folks to knowledge, but you can not make them think.
    The average person thinks he's above average!
    For most folks, they don't know, what they don't know.
    Good judgement comes from experience. Experience comes from bad judgement.

  4. #4
    Join Date
    Nov 2004
    Posts
    7
    Thanks for all your reply.

    How often does this need to be done?
    Ans:I'm involved in the conversion of informatica jobs to Datastage. Once the jobs has been developed, i need to run both and do the data comparison. Since the Source data is same for both, the data should accurately match between both the tables before moving the converted jobs to production environment.

    How fast does the compare process need to be?

    Ans: The process needs to be very very fast.(fastest way is preferable)(Any unix scripts can be used??)

    How likely will differences occur?
    Ans: There may or maynot be differences.

    Do you only need to know a difference exists or do you need to know WHAT is different?
    Ans: I need to find out, what is different

    What is supposed to happen when a difference is detected?
    Ans: The job which populates the table needs to be verifed and corrected

    How will you know which is "correct"?
    Ans: Based on the business rules

    Please let me know if you need more info.

    Thanks

  5. #5
    Join Date
    Aug 2003
    Location
    Where the Surf Meets the Turf @Del Mar, CA
    Posts
    7,776
    Provided Answers: 1
    It can be a challenge to detect & report differences be cause they can exist in 3 forms.
    1) Missing data in one source or the other
    2) "Extra" data in one source or the other
    3) Data is just different between the two sources.

    This can be accomplished by making a single ordered (PK) pass thru each data source.
    By comparing the PK's you can decide if you have extra or missing records (#1 & #2)
    By comparing the non-PK fields you can decide #3.
    Ignore records where PK & remaining fields match across the board.
    What is left over are the differences.
    You can lead some folks to knowledge, but you can not make them think.
    The average person thinks he's above average!
    For most folks, they don't know, what they don't know.
    Good judgement comes from experience. Experience comes from bad judgement.

  6. #6
    Join Date
    Nov 2002
    Location
    Desk, slightly south of keyboard
    Posts
    697
    Hi,

    Is the compare time critical or the load time critical?

    Consider...
    Code:
       create or replace trigger t1t after insert on t1 for each row begin
          update t1_t2_audit 
          set checksum = checksum + :new.total 
          where pk = :new.pk
          returning checksum into nck;
          if sql%rowcount=0 then 
             insert into t1_t2_audit (pk, checksum) values (t1:new, t1:total)
             returning checksum into nck;
          end if;
          if nck = 0 then 
             delete from t1_t2_audit where pk = :new.pk;
          end if;
       end;
    Now create another trigger doing the same on your other table (T2) except it subtracts the total or inserts negative total (total*-1).

    You've just added microseconds to each insert operation during load, but to determine if there is a difference....
    Code:
        select count(*) into something from t1_t2_audit;
        if temp > 0 then shout at something
    To see actual rows with differences....
    Code:
        select * from t1_t2_audit
    It depends where you want the performance to go. It has to go somewhere.

    Incidentally, the minus operation isn't reliable, unless you do....

    select * from t1 minus select * from t2
    union all
    select * from t2 minus select * from t1

    Hth
    Bill
    "No funny answers from me, just suggestions (tm)"
    Please don't email me directly with questions. I've probably just got home from the pub and cannot guarantee the sanity of my answers. In fact, I can't believe I actually made it home.

  7. #7
    Join Date
    Jun 2004
    Location
    Liverpool, NY USA
    Posts
    2,509
    I just ran across an oracle package that is designed to detect differences in two tables that are in different databases. Check out DBMS_RECTIFIER_DIFF. I don't know if this will help, but it looks on point.
    Bill
    You do not need a parachute to skydive. You only need a parachute to skydive twice.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •