We have DB2 V8, running in Z/OS.
I need to create unloads of a large partitioned table - where each partition goes into a separate dataset. I also need to eliminate certain columns along the way, and also reformat the data.
I was accomplishing all of this quite accurately with a DSNTIAUL approach. I had a DSNTIAUL step for every partition (using a WHERE clause to isolate each partition). I also used a SQL SELECT statement combined with various SQL reformat functions such as COALESCE to only select the required columns and to do the reformatting. But the DSNTIAUL approach was going painfully slow when tested against large volumes of data. i.e. - it was taking between 20 minutes and 50 minutes to do each partition (approximately 4 million rows per partition, and 37 bytes worth of data, per row). Since I have 50 partitions, this can add up to a lot of time (even if I concurrently run several partition-specific steps).
I therefore want to explore using the DB2 UNLOAD utility (introduced in V7), as opposed to DSNTIAUL. It claims to be much quicker than DSNTIAUL. I know that I will lose much of the reformatting capabilities by using UNLOAD, but if it is that much quicker, I can always add a SYNCSORT reformatting step to do the reformatting.
I have been reading the various IBM manuals and REDBOOKS on the topic and have even run some initial promising tests - but I am running into some roadblocks. If anyone knows the answers to these questions, please advise.
(1) When using the 'LISTDEF' function to only select certain partitions, can that be combined with the 'FROM TABLE' clause. I need to use the 'FROM TABLE' so that I can just isolate the columns I need, as well as get rid of the leading 2 bytes using the HEADER NONE statement. Both the ability to select only certain columns and the use of the HEADER NONE statement appear to be subsets of the FROM TABLE clause. However, it appears that I cannot place the 'FROM TABLE' clause together with the LISTDEF function. Am I wrong about that?
(2) In the UNLOAD documentation, it presents ways to concurrently unload the partitions (using 'partition parallelism' ) but all of the examples appear to use the TEMPLATE approach to allocating the output datasets. I have successfully tested their examples, but I prefer not to use this approach, as I need to have more control and flexibility with the way I allocate the output datasets. This control and flexibility can only be achieved if I directly allocate these datasets in the JCL and not in the SYSIN TEMPLATE.
Is there a way to achieve the efficiencies of partition parallelism and at the same time, allocate the partition-specific datasets in the JCL and not in the SYSIN TEMPLATE ?
Thank-you for pointing out that UNLOAD TABLESPACE can be used with FROM TABLE. I was able to test this successfully with full production volumes of data. Even if I do not get partition parallelism, the UNLOAD utility performs very quickly and that speed makes up for whatever efficiencies I am losing. I find that I can unload each partition in about 30 seconds (about 4 million rows for eac partition). Furthermore, the SYNCSORT step that I wrote to sort and reformat the data also performs very quickly. So the process to unload all 50 partitions and reformat them should take less than 1 hour.