Results 1 to 6 of 6

Thread: JOIN question:

  1. #1
    Join Date
    Feb 2005
    Posts
    118

    Unanswered: JOIN question:

    Hi All !

    I know that DBA's supporting DB2 for MVS used to discourage a JOIN involving a JOIN of more than 3 tables.
    In my view, over the years, the DB2 optimizer has come a long way, and is getting more and more efficient in handling join involving several tables.
    My question to you all is,
    Does the restriction on number of tables in a JOIN has to do with the legacy of DB2 ?
    Is there any claim that reducing the number of tables in a JOIN can improve performance ? In order words split a 8 table join into three joins and use a application logic to do what db2 would do for you had you used a 8 table join.
    Whether to use many tables in a JOIN or not ? is that a question that depends on if it is DB2 for z/OS versus DB2 for AIX (or any other non mainframe platform).

    Thanks
    Anil

  2. #2
    Join Date
    Jan 2003
    Posts
    1,605
    Hi,
    I have never heard that are somekind of rule: "don't use more than 3 joins" for get performance benefit.

    I use db2 for Linux/Unix/Windows so I don't know anything about MVS. I hope this is not somekind of MVS limitation.

    This has to be an old rule! I have also heard that joins are going bad from programmer in my company (very very very old rule!!!), I just told him to send me couple of SQLs that are running long time. When I have looked at them I just saw lack of SQL writting knowledge, bad talbe design, etc. Tables had no primary keys, no indexes of any kind, badly written SQL, etc. I looked at explain and in 5 minutes of my research the result: create primary key, create index, create foreign key, etc, rewrite SQL and what has happend: sql execution time drop down from 30 minutes to 3 secounds!

    General rule of trumb is: "let database do the whole transformation" then look at indexes, runstats, reorg, etc and then see the explain. In production environment I have joins of 15 tables and all of them are executed within 3 seconds.

    But another rule: "join can be a perormance problem if the end result - and the intermediate results are having a very big number of recourds. So when the result has only 1 to 10 rows than no joins should be avoided.

    I have also seen that some very complex SQLs run for ever, but when appropiate runtstats have been applied (runtats with distribution and detailed indexes all) then sql runed for 5 seconds.

    You asked very general question, so my answer is very general: do not omit joins to number of joins. You need to look at each SQL to see if explain is going well or not!

    Hope this helps,
    Gofaty

  3. #3
    Join Date
    Oct 2005
    Posts
    109
    Anil,

    this got nothing to do with z/OS or LUW, not even DBMS dependend.

    The point is: the fewer joins you have, the better the optimizer will be.

    Generally: It will usually be better to have one large statement involving 8 joins instead of three statements connected with logic; if you consider the I/O and communication, nothing can be gained here, and the database will do internal compare operations or calculations always better than application can do.

    In previous times it was a sign of bad DB design, if more than, lets say 6 joins where involved. (I also suggested to check the design on LUW if that was the case).

    Today things changed so far, that often your design is based on your java application, in the 'worst' case (from db point of view) it is done by a persisitence layer. In the ER design no manual work is invested. So worst case per java class one table - ending up with 1500 tables instead of 100 actually needed.
    Then: any access often results in joins of 15 or more tables.
    If the SQL is lousy generated: the joins will be stated as left outer joins even with following where conditions (which implies inner join) - but with so many tables no optimizer in the world has a chance to always optimize that correctly.

    (E.G. java programmers let the persistence layer do design(they think objective and java), I have seen cases they work only with "select * from..." - doing any logic, even where conditions, themselves. Then I ask about how useful it is to use a database at all...
    Later they run into performance problems and start joining - resulting in many-table-joins since the schema is just given that way.)

    So of course: performance-wise: the fewer tables involved in a join the better, but this is generally overruled by: the fewer statements I have the better.

    It is a trade off: what you invest in your DB design and mapping strategy will finally pay off in performance.

    Well - not only performance. Your admin will be happy to handle 100 tables, but with 1000+ tables administration can get really nasty. Usually default settings for tablespaces and stuff will have to do then.

    If you don't have a performance problem - then don't care. But if you have - to change ER design strategy later in a project is very painful.
    I have seen that too often ...

    I guess I talked a lot since we had many discussions on the topic and it is kind of sensitive for me
    Finally programmers and architects always started considering that.
    Juliane

  4. #4
    Join Date
    Oct 2005
    Posts
    109
    I should have bolded that:

    So of course: performance-wise: the fewer tables involved in a join the better, but this is overruled by: the fewer statements I have the better.
    Juliane

  5. #5
    Join Date
    Mar 2004
    Location
    Toronto, ON, Canada
    Posts
    513
    Quote Originally Posted by juliane26
    I should have bolded that:

    So of course: performance-wise: the fewer tables involved in a join the better, but this is overruled by: the fewer statements I have the better.
    That's a pretty good rule of thumb for OLTP.

    Of course in a data warehouse you can de-normalize some tables for performance reasons and get rid of join conditions. Or use a materialized query table to pre-implement some of the joins.
    --
    Jonathan Petruk
    DB2 Database Consultant

  6. #6
    Join Date
    Nov 2005
    Location
    Toronto
    Posts
    65
    Usually that is true. However you still need to get access plan to analysis. That's totally depends on your system.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •