about my SP, i can't post it because it has about 930 rows... but i've uploaded it on geocities and if anyone wants to see it can donwload it by clicking
here - the SP it's quite simple to be explained:
the problem is we must approach a massive data load from a flat data file, without tie ourselves to a DB vendor... so we can't use the Load utility from IBM. i've developed an SP that recieve in input a CLOB (the flat data file) and parse it to insert data into DB. the SP must be SQL99 compilant.
it has about 150 constants (declare with default) that rappresent the map of the fields in each record of the data file and 150 variables to put the values parsed in. it gets 120 chars (size of one record) each time from the CLOB and parse this string to gather the values of fields. after parsed the record it inserts the data into the table, and so on, until the end of the CLOB is reached.
each record must be parsed with an average of 10 SUBSTR. my test uses a data file with about 6400 records (in a file of 750 KB) and - the test - DOES NOT perform inserts into tables - i'd like to test performance of the pure parsing first.
i don't think it's possible to optimize this kind of SP, because it's already quite simple
when i do the test on my local UDB, this TLB files appears, making my test end on a 57011 disk full error. the same thing seems to happen on an AIX system (but i've not access to that filesystem, so i don't know if the 57011 is provocated by a file similar to the TLB on windows).
some considerations:
1. the CLOB in input is declared with a size of 100 MB. i don't think this could be the problem, because if i give a file of 32 KB instead of 750 KB CLOB(100000000), CLOB(32000) and VARCHAR(32000) has the same behavior.
2. the code i uploaded contains a commit every 100 record parsed, but i've tried commit every 10 and 1000 records (even with only one commit at the end) and the results are very similar, few ms of difference. i don't think the TLB contains - b.e. - rollback infos because even with no inserts (only parsing) it's generated at the same size.
3. i've tried to propose a solution with a parsing in Java, but it wasn't accepted because we have critical needs of perfomance (final goal is to parse up to 80 GB in ONE hour, in the worse case).
4. i remind that we cannot use the DB2 utility Load because we can't tie up with a DB vendor: the procedure must be portable from DB2 to Oracle to SQL Server etc. and so compatible with SQL99 standards.