If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > DB2 > insert slow in DPF environment

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 12-26-10, 01:12
oracle10gsingh oracle10gsingh is offline
Registered User
 
Join Date: Nov 2007
Posts: 72
insert slow in DPF environment

Hi ,

i created a table in tablespace spread across 24 partitions with range partitioning done only for the month of january in that tablespacce i am getting a speed of 1000 records / per minute .before that i created the same table on node num 0 that time i was getting speed of 1 million records/per minute ..i am inserting 400 million records for 1st january 2nd and 3rd.

i am calling a stored procedure to insert records .Please tell me how i can speed up the process..here is my table ddl .

Create table Maxrecon.Test_load (subscriber_arrangement_id Bigint, Start_date date, Event_id smallint, COl1 smallint, Col2 Smallint) DISTRIBUTE BY HASH("SUBSCRIBER_ARRANGEMENT_ID") PARTITION BY RANGE("START_DATE") (PART "JANUARY10" STARTING('2010-01-01') ENDING('2010-01-31') IN "MAXTEMP") ORGANIZE BY ( START_DATE,EVENT_ID)

Regards,
Saurabh Narula
Reply With Quote
  #2 (permalink)  
Old 12-26-10, 08:55
Marcus_A Marcus_A is offline
Registered User
 
Join Date: May 2003
Location: USA
Posts: 5,198
The main problem is probably that you are not using the Load Command, which is important with DPF to get maximum performance (parallel bulk loads). If you could do a load from cursor in the stored procedure, I would try that, otherwise probably try to format a flat file to be loaded using the Load command.

I notice that you are using the following types of partitioning:
  • DPF (DISTRIBUTE BY HASH)
  • Table Partitioning (PARTITION BY RANGE)
  • MDC (ORGANIZE BY)

I can't see your other Keys and Indexes, but if you want partitioned indexes, the Table Partitioning column must be included in all unique indexes (PK's, etc).

Do you really need the MDC? My gut feel is that you have way too much data for each date to make that part of a MDC, and since you are already doing Table Partitioning by Date, what exactly is the point of the MDC?.

Instead, I would create a PK of (Start_date, subscriber_arrangement_id) in that order (you need start_date in the PK to have partitioned indexes). Then create another index on Event_id (if it has sufficient cardinality), and a third (non-unique) index on subscriber_arrangement_id (cannot be unique if you want partitioned indexes). Forget about MDC for this table.
__________________
M. A. Feldman
IBM Certified DBA on DB2 for Linux, UNIX, and Windows
IBM Certified DBA on DB2 for z/OS and OS/390
Reply With Quote
  #3 (permalink)  
Old 12-26-10, 19:42
db2girl db2girl is offline
∞∞∞∞∞∞
 
Join Date: Aug 2008
Location: Toronto, Canada
Posts: 1,816
If you can't use load, maybe import with buffered inserts? Take a look at the insert related slides at the end of this presentation:
http://public.dhe.ibm.com/software/d...203-slides.pdf
Reply With Quote
  #4 (permalink)  
Old 12-27-10, 04:11
dr_te_z dr_te_z is offline
Registered User
 
Join Date: Jan 2009
Location: Zoetermeer, Holland
Posts: 555
Quote:
Originally Posted by BELLO4KA View Post
If you can't use load, maybe import with buffered inserts? Take a look at the insert related slides at the end of this presentation:
http://public.dhe.ibm.com/software/d...203-slides.pdf
Nice oopsie in that presentation on slide #22.
Like Canada belgium (Belgium - Wikipedia, the free encyclopedia) does not have its own language. Unlike Canada this is a very sensible and painful issue causing a lot of political trouble for decades now So IBM: I would block that presentation for belguim IP numbers
Reply With Quote
  #5 (permalink)  
Old 12-27-10, 13:55
JAYANTA_DATTA JAYANTA_DATTA is offline
Registered User
 
Join Date: Oct 2004
Location: DELHI INDIA
Posts: 336
Saurav,
you are giving a comparison figure between the INSERTs on the table on Node-0 with inserting into the same table on multi-node. Did you try to see similar difference in the throughput rate while you were trying to LOAD a data file into the same table (partitioned and then to the non-partitioned one) ? If you observe that even LOAD command is also running too slow on the partitioned table compared to the non-partitioned one, it means first the bottleneck in your DPF set-up (related to the table) needs to be fixed. Its possible that the single partition table is on a different tablespace and bufferpool where only that table exists, whereas the partitioned table is on a Bufferpool which is being used by multiple tables at the sametime etc. Other stuffs which you need to check is whether your data is evenly distributed across all partition, or its skewed on a partitcular node due to wrong selection of the partitioning key, similarly check for page-cleaners and related values like FCM, container distribution across file-system etc. Let us know with your input so that others can help you on this with further suggestion after understanding the real situation here.
__________________

Jayanta Datta
DB2 UDB DBA
IBM India, Global Delivery
New Delhi

Last edited by JAYANTA_DATTA; 12-27-10 at 13:58. Reason: typo
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On