If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > DB2 > JOIN question:

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 11-15-05, 04:44
AnilKale AnilKale is offline
Registered User
 
Join Date: Feb 2005
Posts: 118
JOIN question:

Hi All !

I know that DBA's supporting DB2 for MVS used to discourage a JOIN involving a JOIN of more than 3 tables.
In my view, over the years, the DB2 optimizer has come a long way, and is getting more and more efficient in handling join involving several tables.
My question to you all is,
Does the restriction on number of tables in a JOIN has to do with the legacy of DB2 ?
Is there any claim that reducing the number of tables in a JOIN can improve performance ? In order words split a 8 table join into three joins and use a application logic to do what db2 would do for you had you used a 8 table join.
Whether to use many tables in a JOIN or not ? is that a question that depends on if it is DB2 for z/OS versus DB2 for AIX (or any other non mainframe platform).

Thanks
Anil
Reply With Quote
  #2 (permalink)  
Old 11-15-05, 05:31
grofaty grofaty is offline
Registered User
 
Join Date: Jan 2003
Posts: 1,570
Hi,
I have never heard that are somekind of rule: "don't use more than 3 joins" for get performance benefit.

I use db2 for Linux/Unix/Windows so I don't know anything about MVS. I hope this is not somekind of MVS limitation.

This has to be an old rule! I have also heard that joins are going bad from programmer in my company (very very very old rule!!!), I just told him to send me couple of SQLs that are running long time. When I have looked at them I just saw lack of SQL writting knowledge, bad talbe design, etc. Tables had no primary keys, no indexes of any kind, badly written SQL, etc. I looked at explain and in 5 minutes of my research the result: create primary key, create index, create foreign key, etc, rewrite SQL and what has happend: sql execution time drop down from 30 minutes to 3 secounds!

General rule of trumb is: "let database do the whole transformation" then look at indexes, runstats, reorg, etc and then see the explain. In production environment I have joins of 15 tables and all of them are executed within 3 seconds.

But another rule: "join can be a perormance problem if the end result - and the intermediate results are having a very big number of recourds. So when the result has only 1 to 10 rows than no joins should be avoided.

I have also seen that some very complex SQLs run for ever, but when appropiate runtstats have been applied (runtats with distribution and detailed indexes all) then sql runed for 5 seconds.

You asked very general question, so my answer is very general: do not omit joins to number of joins. You need to look at each SQL to see if explain is going well or not!

Hope this helps,
Gofaty
Reply With Quote
  #3 (permalink)  
Old 11-15-05, 05:32
juliane26 juliane26 is offline
Registered User
 
Join Date: Oct 2005
Posts: 109
Anil,

this got nothing to do with z/OS or LUW, not even DBMS dependend.

The point is: the fewer joins you have, the better the optimizer will be.

Generally: It will usually be better to have one large statement involving 8 joins instead of three statements connected with logic; if you consider the I/O and communication, nothing can be gained here, and the database will do internal compare operations or calculations always better than application can do.

In previous times it was a sign of bad DB design, if more than, lets say 6 joins where involved. (I also suggested to check the design on LUW if that was the case).

Today things changed so far, that often your design is based on your java application, in the 'worst' case (from db point of view) it is done by a persisitence layer. In the ER design no manual work is invested. So worst case per java class one table - ending up with 1500 tables instead of 100 actually needed.
Then: any access often results in joins of 15 or more tables.
If the SQL is lousy generated: the joins will be stated as left outer joins even with following where conditions (which implies inner join) - but with so many tables no optimizer in the world has a chance to always optimize that correctly.

(E.G. java programmers let the persistence layer do design(they think objective and java), I have seen cases they work only with "select * from..." - doing any logic, even where conditions, themselves. Then I ask about how useful it is to use a database at all...
Later they run into performance problems and start joining - resulting in many-table-joins since the schema is just given that way.)

So of course: performance-wise: the fewer tables involved in a join the better, but this is generally overruled by: the fewer statements I have the better.

It is a trade off: what you invest in your DB design and mapping strategy will finally pay off in performance.

Well - not only performance. Your admin will be happy to handle 100 tables, but with 1000+ tables administration can get really nasty. Usually default settings for tablespaces and stuff will have to do then.

If you don't have a performance problem - then don't care. But if you have - to change ER design strategy later in a project is very painful.
I have seen that too often ...

I guess I talked a lot since we had many discussions on the topic and it is kind of sensitive for me
Finally programmers and architects always started considering that.
__________________
Juliane
Reply With Quote
  #4 (permalink)  
Old 11-15-05, 05:35
juliane26 juliane26 is offline
Registered User
 
Join Date: Oct 2005
Posts: 109
I should have bolded that:

So of course: performance-wise: the fewer tables involved in a join the better, but this is overruled by: the fewer statements I have the better.
__________________
Juliane
Reply With Quote
  #5 (permalink)  
Old 11-15-05, 07:53
J Petruk J Petruk is offline
Registered User
 
Join Date: Mar 2004
Location: Toronto, ON, Canada
Posts: 513
Quote:
Originally Posted by juliane26
I should have bolded that:

So of course: performance-wise: the fewer tables involved in a join the better, but this is overruled by: the fewer statements I have the better.
That's a pretty good rule of thumb for OLTP.

Of course in a data warehouse you can de-normalize some tables for performance reasons and get rid of join conditions. Or use a materialized query table to pre-implement some of the joins.
__________________
--
Jonathan Petruk
DB2 Database Consultant
Reply With Quote
  #6 (permalink)  
Old 11-15-05, 08:46
wangzhonnew wangzhonnew is offline
Registered User
 
Join Date: Nov 2005
Location: Toronto
Posts: 65
Usually that is true. However you still need to get access plan to analysis. That's totally depends on your system.
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On