We have a monthly scheduled job to retain the memberships of Airline. UNIX shell script calls C executable which further calls Oracle procedures. As the job has to process large volumes of data the memberships numbers are divided into 5 streams and 5 instances of shell scripts are executed.
However the problem occurs with the fifth stream. It takes unexpectedly long time. All 4 streams 1,2,3,4 finishes within 4-5 hours but the 5th stream goes upto 10 hours. All the streams have to process same number of records.
The main cursor from which the records are retrieved for processing is just the select from single table. (No Complex Joins). I have got the table analyzed from which membership numbers are retrieved.
What are the different things on which I should base my investigations?
If the fifth one takes twice as long, I would look at how many "sessions per user" for the oracle user are allowed. When you say processes, do they end up running DBS_JOBS? How many "job queue processes" are configured?
You do not need a parachute to skydive. You only need a parachute to skydive twice.
1. how are you splitting the data between the 5 processes, could it be something about the data distribution which is doing it.
2.Is the data being modified at the same time, for example if you have partitioned by date and the slow process is operating on the partition which is being written to.
3. What do your stored procedures do exactly?
4. What do your C programs do?
There are two jobs involved. The first job gets the total number of memberships to be processed. ( Say 100000) and divides it into 5 equal streams. ( Records divided in 5 chunks of 20000 each. The minimum membership no and maximum memberships no along with sequence number of each stream is stored in intermediate table.
Now the next main job runs with seq_no as parameter. Based on sequence number the minimum and maximum membership no is extracted and the job works on these records. (Range) (Each stream works on same number of records).
There is no partioning by date.
All the streams work simultaneously. The logic of job (Oracle Procedures) involves basic checks from few tables and insertion of new membership details.
Pro*C doesn’t involves any logic. Just used to call Oracle procedures.
I would say the best thing is to put timing code into your C and plsql code to see which particular stage is the bottleneck. Then you can tune it much easier rather than guessing where the problem resides.
I don't think this is the tuning problem. This is something to do with last stream hanging.
All the streams execute the same piece of code (Oracle procedure) . All the streams have similar number of records. Then how the 4 streams process within 4.5 hours and the fifth one goes upto 9-10 hours.
What I meant is that once you have timing code you know where the problem lies, at the moment we can but guess at what the issue is because there are a number of stages where it could be bottlenecking.