Results 1 to 4 of 4
  1. #1
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1

    Unanswered: Most efficient syntax for CONNECT BY....

    I am running queries on a recursive table using Oracle's CONNECT BY operator. I am looking for the most efficient way to implement it.

    This recursive query runs nearly instantaneously, and returns a single value (12) since the specified product_id has no descendants:
    Code:
    SELECT     product_id
    FROM       product_dim
    START WITH product_dim.product_id = 12
    CONNECT BY PRIOR product_dim.product_id = parent_prod_id;
    This query also runs nearly instantaneously:
    Code:
    SELECT COUNT(*)
    FROM   sale
    WHERE  product_id = 12;
    Now I want to merge the two queries, returning all the sale data matching the results of the recursive query. I would think this would also run very fast, as the recursive query only needs to be executed once, and only returns a single record, and the resulting aggregation should require little time.

    But this method of merging the two statements takes 70 seconds to process:
    Code:
    SELECT     COUNT(*)
    FROM       product_dim LEFT OUTER JOIN sale ON product_dim.product_id = sale.product_id
    START WITH product_dim.product_id = 12
    CONNECT BY PRIOR product_dim.product_id = parent_prod_id;

    This method runs significantly faster, but still requires 32 seconds to complete:
    Code:
    SELECT COUNT(*)
    FROM   sale
    WHERE  product_id IN(SELECT     product_dim.product_id
                         FROM       product_dim
                         START WITH product_dim.product_id = 12
                         CONNECT BY PRIOR product_dim.product_id = parent_prod_id);

    Finally, these two methods are the most efficient I have found, requiring 20 seconds each to complete:
    Code:
    SELECT COUNT(*)
    FROM   sale INNER JOIN   --IDList
                          (SELECT     product_dim.product_id
                           FROM       product_dim
                           START WITH product_dim.product_id = 12
                           CONNECT BY PRIOR product_dim.product_id = parent_prod_id) idlist ON sale.product_id =
                                                                                                           idlist.product_id;
    SELECT COUNT(*)
    FROM   sale,
           (SELECT     product_dim.product_id
            FROM       product_dim
            START WITH product_dim.product_id = 12
            CONNECT BY PRIOR product_dim.product_id = parent_prod_id) idlist
    WHERE  sale.product_id = idlist.product_id;
    My question is, am I missing a faster syntax? I am worried that when I run this with a subquery that returns thirty or forty descendants the entire statement will take an inordinate amount of time to complete. Is there a way to force Oracle to execute the statements separately, because in this case the combined time required to run them independently is orders of magnitude faster than running them as a single statement.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  2. #2
    Join Date
    Dec 2006
    Posts
    15
    Do you have an index on "product_id" column for table "sale" ?

    I think "theorically", the faster should be the one with the "product_id IN (SELECT product_dim.product_id".

    Is that Evan's Gambit ?
    Last edited by fischermx; 03-02-07 at 00:30.

  3. #3
    Join Date
    Nov 2006
    Location
    Indianapolis
    Posts
    90
    What do the explain plans look like?

    All your connects look the same to me - I wouldn't think that part of the plan would vary between your examples, you're always starting with a given product_id

  4. #4
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    Yes, the table is sufficiently indexed.

    Regarding the Explain Plans:

    Method #3 (the fastest) performs an hash join between "INDEX FAST FULL SCAN" and a view with an "INDEX UNIQUE SCAN". It is the INDEX FAST FULL SCAN that eats up almost all of the cost of the query.

    Method #2 performs all the actions of Method #3, but adds in a HASH JOIN SEMI which itself quadruples the cost of the query.

    Method #1 (the slowest) connects a FILTER task, a HASH JOIN, and a COUNT, each of which contains a nested loop.

    What is don't understand is why the engine does not just calculate the subquery (which only returns one record is runs lightning fast) and apply the result to the outerquery (which also runs lightning fast when the subquery result is hard-coded. Instead, the combined process is slower by several orders of magnitude. Is there a more efficient way to do this? I suppose I could load the results of the subquery into a temporary table and then link that to the outer query, but Oracle seems to frown on the use of temporary tables.

    And yes, that is the Evan's Gambit "Mr. Fischer". How are things in Iceland?
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •