For the past few days we have been hitting a very bizarre problem with one of our batch jobs.
The details are:
We receive a file containing tracking data.
Our batch job processes the file and validates the data in it. First the data is loaded into work tables; then it is validated against the LIVE tables.
One of our checks is behaving very strangely: it marks records as erroneous when it is quite clear that the data is valid and they should not be flagged.
The update that is behaving incorrectly is (the subselect was mangled when posting; reconstructed here with the three live tables named below):
UPDATE wrk_tms_new_nap_trk wtnnt
SET wtnnt.msg_cd = i_Error_Cd,
    wtnnt.msg_typ_cd = i_Error_Cat,
    wtnnt.upd_dt = SYSDATE,
    wtnnt.upd_user_id = i_Prg_Nm,
    wtnnt.chg_evnt_typ_cd = 'U'
WHERE NOT EXISTS
      (SELECT 1
         FROM memberships mem,
              loyalty_programmes lp,
              partner_lyty_scheme pls
        WHERE mem.cid = wtnnt.cid
          AND mem.lyty_prg_cd = lp.lyty_prg_cd
          AND mem.curr_mbrp_sts_cd = 'A'
          AND lp.lyty_prg_typ_cd = 'F'
          AND mem.curr_mbrp_sts_start_dt <= TRUNC(SYSDATE)
          AND pls.partner_cd = wtnnt.partner_cd
          AND pls.lyty_scheme_cd = mem.lyty_prg_cd
          AND pls.start_dt <= TRUNC(SYSDATE)
          AND pls.end_dt >= TRUNC(SYSDATE))
  AND wtnnt.msg_cd IS NULL
  AND wtnnt.prg_nm = i_Prg_Nm
  AND wtnnt.bch_no = i_Bch_No
  AND wtnnt.set_cd = i_Set_Cd;
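For reference, the correlated subquery can be run standalone against a single failed record by replacing the correlated columns with bind variables. This is only a sketch: `:failed_cid` and `:failed_partner_cd` are placeholders for values taken from an actual erroneous work-table row, and the column names are assumed from the update above.

```sql
-- Run the membership check in isolation for one failed record.
SELECT mem.cid, mem.lyty_prg_cd, pls.partner_cd,
       pls.start_dt, pls.end_dt
  FROM memberships          mem,
       loyalty_programmes   lp,
       partner_lyty_scheme  pls
 WHERE mem.cid                    = :failed_cid        -- from the failed work-table row
   AND mem.lyty_prg_cd            = lp.lyty_prg_cd
   AND mem.curr_mbrp_sts_cd       = 'A'
   AND lp.lyty_prg_typ_cd         = 'F'
   AND mem.curr_mbrp_sts_start_dt <= TRUNC(SYSDATE)
   AND pls.partner_cd             = :failed_partner_cd -- from the failed work-table row
   AND pls.lyty_scheme_cd         = mem.lyty_prg_cd
   AND pls.start_dt              <= TRUNC(SYSDATE)
   AND pls.end_dt                >= TRUNC(SYSDATE);
```

If this returns rows immediately after a failed run, the question becomes what the update statement itself saw at execution time, not what the tables contain now.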
The data in the live tables memberships, partner_lyty_scheme & loyalty_programmes is valid. Still, the records pass the NOT EXISTS check and are marked as erroneous.
This does not happen with all the records in a file. Say there are 10 records with similar data: 7 fail and 3 succeed. The strange thing is that when I reprocess the file, even only 10 minutes later and without changing anything anywhere, the previously erroneous records do not fail and are processed successfully.
For investigation I asked the DBAs to set up triggers to capture the data in all the involved tables. The triggers capture the data for the erroneous records and, as expected, everything is valid. Run on its own, the subselect for these records does bring back rows.
But somehow Oracle decides at runtime that the subselect returns no rows, so the records pass the NOT EXISTS check even though the rows exist.
This misbehaviour shows no trend. Some days no files at all are affected; on other days 3-4 files in the same day are affected. I tried to find a date/time pattern, but it simply happens on any day, at any time.
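One way to see exactly what the update read, rather than what the tables contain a few minutes later, is to log the system change number (SCN) just before the update runs and later replay the lookup as of that SCN with a flashback query. A sketch, assuming a hypothetical logging table `diag_scn_log` (not part of the original schema) and the batch variables from the job:

```sql
-- Assumed helper table:
--   CREATE TABLE diag_scn_log (bch_no NUMBER, scn NUMBER, logged_at DATE);
DECLARE
  l_scn NUMBER;
BEGIN
  -- SCN as of which the following UPDATE will read the live tables
  l_scn := DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER;
  INSERT INTO diag_scn_log (bch_no, scn, logged_at)
  VALUES (i_Bch_No, l_scn, SYSDATE);
  -- ... run the UPDATE here ...
END;
/

-- Later, replay the membership lookup as the update saw it:
SELECT COUNT(*)
  FROM memberships AS OF SCN :logged_scn mem
 WHERE mem.cid = :failed_cid
   AND mem.curr_mbrp_sts_cd = 'A';
```

If the flashback query returns 0 while the same query against the current data returns rows, the rows were committed only after the update started.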
From what I can see, the live tables may be changing during your load through transactions that have not yet committed. Your triggers won't pick this up: a trigger fires inside the modifying transaction and sees its uncommitted data, but it does not record when the commit occurs, and therefore cannot tell you when your loading session would first see the new data.
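One quick way to test this theory is to check when the matching rows were actually committed. The ORA_ROWSCN pseudocolumn gives the commit SCN (block-level unless the table was created with ROWDEPENDENCIES), and SCN_TO_TIMESTAMP maps it to an approximate time; if that time falls after the batch job started, the rows were simply not yet visible to the update. A sketch, with `:failed_cid` as a placeholder:

```sql
-- Approximate commit time of the membership rows for one failed CID.
-- NB: ORA_ROWSCN is block-level by default, and SCN_TO_TIMESTAMP only
-- works for fairly recent SCNs, so treat the result as an approximation.
SELECT mem.cid,
       ORA_ROWSCN                   AS commit_scn,
       SCN_TO_TIMESTAMP(ORA_ROWSCN) AS approx_commit_time
  FROM memberships mem
 WHERE mem.cid = :failed_cid;
```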
I would say the best thing to do, if nothing else is altering the three tables in the subselect, is to export those three tables plus the work table wrk_tms_new_nap_trk before the next batch job. If the next run produces the error, import the four tables into your test database and run the update there. Try to duplicate the error: it is either a problem with your data or you've hit an Oracle internal bug (unlikely, but it does happen). Check Metalink for similar bugs.
Like I said, export the four tables to a test system and reproduce the error. Load the staging table and then run the update statement. If it doesn't update properly you can check your data, the subquery, etc. in isolation from the rest of your loading process. You then also have a test case to send to Oracle if you still can't find the problem.