If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > DB2 > Partition Key Design

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 10-27-03, 17:45
koganti koganti is offline
Registered User
 
Join Date: Jul 2002
Location: ATL
Posts: 170
Partition Key Design

What are the procedures to be followed when you design a partition key in EEE.
Is there any document for the design procedures.


Thanks
Prasad
Reply With Quote
  #2 (permalink)  
Old 10-27-03, 20:54
Marcus_A Marcus_A is offline
Registered User
 
Join Date: May 2003
Location: USA
Posts: 5,198
The "Administration Guide: Planning" has a discussion of this topic.

It is a good idea to force related data that is frequently joined to be on the same partition.

For example, on an order table, the primary key would be order_mumber. This would make a good partitioning key for this table.

The order_item table has a primary key of (order_number, item_number). But the partitioning key of this table should be order_number (not the full primary key). This forces the related rows of the order table and the order_item table to be on the same partition, which allows them to be joined efficiently (without sending data to another partition to be joined).

The fact that the primary key is not always the best partitioning key (although sometimes it is) is the hardest concept for people to understand, and by far the biggest mistake made in parallel database implementation.
Reply With Quote
  #3 (permalink)  
Old 10-28-03, 10:21
koganti koganti is offline
Registered User
 
Join Date: Jul 2002
Location: ATL
Posts: 170
Partition Key

Thank You Marcus.
Administration Guide really does not provide much details in terms of picking up a Partition key.
Iam also looking at the performance issues when picking up a partion key.
I was wondering what situations you might pick up primary key as partition key and in what situations you will pick up non primary key and what situations you will pick up multiple columns for a partion key and what are its advantages and disadvantages.
Does the partition key design be different in OLAP and OLTP environments


Thanks
Prasad
Reply With Quote
  #4 (permalink)  
Old 10-28-03, 10:39
sathyaram_s sathyaram_s is offline
Super Moderator
 
Join Date: Aug 2001
Location: UK
Posts: 4,534
Re: Partition Key

As Marcus has explained, the primary consideration when choosing a partioning key should be to avoid/minimize data shipping ...

There are no hard and fast rules for choosing the partitioning key ..

For example, if for two large tables , say 90%, of theie workload join on 3 columns , make these three columns as the partitioning key , irrespective of what the primary key column(s) is ....

From my point of view, OLTP or OLAP, the considerations remain the same ... Ask yourself which queries are critical,important and design your partitioning key based on this ..


HTH

Sathyaram

Quote:
Originally posted by koganti
Thank You Marcus.
Administration Guide really does not provide much details in terms of picking up a Partition key.
Iam also looking at the performance issues when picking up a partion key.
I was wondering what situations you might pick up primary key as partition key and in what situations you will pick up non primary key and what situations you will pick up multiple columns for a partion key and what are its advantages and disadvantages.
Does the partition key design be different in OLAP and OLTP environments


Thanks
Prasad
__________________
Visit the new-look IDUG Website , register to gain access to the excellent content.
Reply With Quote
  #5 (permalink)  
Old 10-28-03, 11:00
Marcus_A Marcus_A is offline
Registered User
 
Join Date: May 2003
Location: USA
Posts: 5,198
If you have an OLAP environment, you should consider using DB2 ESE V8 with the Partitioning Option to have the option of using MDC (Multi Dimensional Clustering) that is available in version 8. Since you mentioned EEE in your original post, I was assuming you were using DB2 EEE version 7.

As mentioned in the “DB2 Administration Guide: Planning” (Version 7) page 104, one of the main factors is to use a column with high cardinality that will distribute the data evenly across the partitions when that column is hashed. As previously mentioned, getting frequently joined tables on the same partition (for related rows) is also important. These are the two main considerations.

But to answer your question requires looking at the entire database design in total and not just any one table. But if you happen to define a primary key through declared RI, the partitioning index must be a column that is contained in the primary (but not necessarily the whole primary key).

If I had to pick one rule of thumb, without seeing the database design, it would be to use one column of the primary key (whether or not RI is explicitly defined), and that column would be the one with the most unique values (usually the first column if the primary key has multiple columns).

Keep in mind that other indexes can be created on the table. Also there are decisions about clustering (or MDC with DB2 version 8) that need to be decided.
Reply With Quote
  #6 (permalink)  
Old 10-28-03, 11:47
koganti koganti is offline
Registered User
 
Join Date: Jul 2002
Location: ATL
Posts: 170
Thanks A Lot for you all
That was valuble information.
Iam trying to understand how the data be distributed across the partions if the partition key has multiple columns.


Thanks
Prasad
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On