Hello,
I am running a multithreaded application against a DB2 database - and I am trying to breakdown the total data retrieved by the query below into smaller subsets that each of the threads in the application could process independently -
Select
X.C1
from X left outer join Y
on
X.Ck = Y.Ck*
Ck - is the distribution key for both X and Y - they are collocated
(This query is representative of the actual workload I am trying to optimize)
The first query I wrote to split this workload down into 8 smaller subsets (as there are 8 partitions on our test DB server) was -
Partition-1
Select
X.C1
from X left outer join Y
on
X.Ck = Y.Ck
where mod(X.Ck,8) = 0
Partition-2
Select
X.C1
from X left outer join Y
on
X.Ck = Y.Ck
where mod(X.Ck,8) = 1
and so on for other partitions
however, I am told that this is running 8 joins - one for each thread - ON ALL PARTITIONS and then the mod predicate is being applied - that defeats the purpose of breaking the workload down
So I am considering using the nodenumber function as an alternative to the mod function -
obvious question - Would nodenumber function ensure the query issued from each partition be restricted to a single node?
As always, thanks for your help!!!!
- G