Results 1 to 6 of 6
  1. #1
    Join Date
    Mar 2004
    Posts
    30

    Unanswered: Refential Integrity on Batch Load Basis in Oracle 10G

    Hi, Im looking to maintain referntial integrity within an Oracle 10G database, which is loaded on a batch basis. The diagram I have attached describes it best :

    - Say I have a parent and a child table, with existing primary/foreign keys respectively A,B,C and A,C ( child foreign key must exist in parent ) and refential integrity must be mantained
    - Say I attempt to insert new rows into both tables on a batch basis ( parent new rows = A,D,E : child new rows = D,E,F ) from corresponding "staging" tables, which are of exactly the same format to the published tables, but just contain data "in-transit". Obviously I would run the parent batch load before the child to ensure refential integrity as much as possible.
    - So what Im looking to occur, is that the parent rows D,E are loaded successfully, but the A row will not ( duplicate primary key ). Normally an INSERT SELECT between the corresponding staging and published tables would fail due to the duplicate primary key, and nothing would be inserted at all, but I want to just catch that exception, and store it somewhere ( in an exception table ), but still have the successful rows loaded.
    - Additionally, when loading the child table, the rows D,E are loaded successfully as their parents were just loaded, but F would fail due to a foreign key violation ( the non-existance of a F row in the parent table ). I would like the same thing to occur as above....

    So can anyone help me out with how I can achieve this using Oracle features, maybe even new ones in 10G. Do i have to define the tables to capture constraint errors, or write the INSERT SELECT sql differently, and how can i store exceptions off to the side somewhere, or alternatviely, can I update that violating staging row to indicate failure ?

    Ive seen the ENABLE PRIMARY KEY EXCEPTIONS INTO clause, but the whole INSERT SELECT statement still fails if a single row violates, and plus, I cant see any corresponding clause for foreign keys exceptions ? Can anyone please assist ? I would like to do this at the database level rather than application level if possible. Does anyone also have any opinions on the overhead involved in doing it at the DB level also ? Thanks. Adam
    Attached Thumbnails Attached Thumbnails dbforums-pic.jpg  

  2. #2
    Join Date
    Apr 2003
    Location
    Greenville, SC (USA)
    Posts
    1,155
    The "batch load" ... How about making it into a pl/sql routine using cursors
    then checking each insert with "exception when dup_val_on_index then
    write to exception table" or something like that ...

    cursor get_parent is select ... from parent_table;

    open get_parent
    fetch get_parent into ....;
    open get_parent;
    exit when get_parent%notfound;
    begin
    insert into anothertable values (....);
    exception when others then
    insert into exception table ...
    end;
    fetch get_parent into ....;

    end loop;
    close get_parent;
    open get_child ...

    another loop
    ...
    ...
    ...

    Not up on all new features of 10g yet...
    Gregg

  3. #3
    Join Date
    Jun 2004
    Location
    Liverpool, NY USA
    Posts
    2,509
    How about something like

    insert into exception_table
    select *
    from my_load_table a
    where exists (select 'x'
    from master_table b
    where b.primary_key = a.primary_key);

    Then

    insert into master_table
    select *
    from my_load_table a
    where not exists (select 'x'
    from master_table b
    where b.primary_key = a.primary_key);
    Bill
    You do not need a parachute to skydive. You only need a parachute to skydive twice.

  4. #4
    Join Date
    Mar 2004
    Posts
    30

    Response

    Thanks for your reply guys...

    1. Gregg - Way too many rows to be inserted at a time to consider using PL/SQL looping, im talking 50,000 on average a day per run ( with 800 runs a day ).
    2. Bill - I think the latency between the two statements might cause problems. I guess your solution may work but it enforces at the application level only, I was really hoping to do it at the DB level if possible.

    Just got me thinking, looking at the volumes of data above, maybe I would be better off disabling RI before loading ( so it doesnt have to check each row being inserted one at a time : kinda the same as greggs solution really but performed internally in Oracle ), loading the data, turning RI back on and then catching the exceptions ? Does anyone have any knowledge in how to do this successfully...

    I guess my end game is to ensure RI on a large scale database but also be wary of performance in getting the data in... Obviously there has to be a trade off somewhere... Thanks, Adam

  5. #5
    Join Date
    Aug 2003
    Location
    Where the Surf Meets the Turf @Del Mar, CA
    Posts
    7,776
    Provided Answers: 1
    IIRC, SQLLDR does what you want to do by using the BAD= qualifier.
    You can lead some folks to knowledge, but you can not make them think.
    The average person thinks he's above average!
    For most folks, they don't know, what they don't know.
    Good judgement comes from experience. Experience comes from bad judgement.

  6. #6
    Join Date
    Mar 2004
    Posts
    30
    very good solutions i see it also rejects records that fail RI, as well as records that are invalid ( non numeric, wrong layout, etc. ) although a few outstanding things which im not sure it will address....

    1 ) it obviously only works when loading files ( SQLLdr ), i really wanted it for when doing INSERT SELECTS from a staging table to a published table ( both tables are of same format. i guess i can compromise on the design to check RI while I am loading into Oracle Staging from files rather than just before "publishing" whislt already in Oracle
    2 ) although the bad file contains all rejected records, it obviously doesnt say why the were rejects ( key violation, invalid record format ), and the only place that is defined is in the log file, which is not really conducive to loading the reason codes into a table anywhere...
    3 ) i want to be able to reprocess these failed records ( those that failed RI that is ) at a later stage because the missing record might be there then, as per above it would be good to have the offending record in an Oracle Staging area ( that doesnt have RI enforced ), to report on it and for ease of rerunning, and then try to publish again to the RI enforced Published Area...

    Thanks for your suggestion, do you have any more ideas on the matter?
    Adam

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •