Results 1 to 13 of 13
  1. #1
    Join Date
    Apr 2009
    Posts
    5

    Unanswered: Strange (unoptimized) query behavior

    Hi,

    I have two tables in which both have a column called 'serial' assigned as primary key.
    I do two queries using LEFT JOIN: QUERY1 with aliased column from joined table and QUERY2 without it (see sql below). As I check the Explain Plan using Oracle SQL Developer tool i realized that one does a FULL search whereas the other one does UNIQUE SCAN (probably using the primary key's index)

    [QUERY 1]:
    SELECT rmk.domainSr, rmk.serial, rmk.category
    *, reg1.name as key1Name*
    FROM ALT_RegistryMultikey rmk
    LEFT JOIN ALT_Registry reg1 ON reg1.serial = rmk.key1RegSr

    PRODUCES EXPLAIN:
    OPERATION OBJECT NAME OPTIONS COST
    ----------------- ----------- ------- ----
    SELECT STATEMENT 19
    -HASH JOIN OUTER 19
    - Access Predicates
    - REG1.SERIAL(+)=RMK.KEY1REGSR
    -TABLE ACCESS ALT_REGISTRYMULTIKEY FULL 9
    -TABLE ACCESS ALT_REGISTRY FULL 9


    [QUERY 2]:
    SELECT rmk.domainSr, rmk.serial, rmk.category
    FROM ALT_RegistryMultikey rmk
    LEFT JOIN ALT_Registry reg1 ON reg1.serial = rmk.key1RegSr

    PRODUCES EXPLAIN:
    OPERATION OBJECT NAME OPTIONS COST
    ----------------- ----------- ------- ----
    SELECT STATEMENT 9
    -NESTED LOOPS OUTER 9
    -TABLE ACCESS ALT_REGISTRYMULTIKEY FULL 9
    -INDEX SYS_C006516 UNIQUE SCAN 0
    - Access Predicates
    - REG1.SERIAL(+)=RMK.KEY1REGSR


    Why is that??
    Why only adding a column to be fetched (reg1.name as key1Name) causes that drastic change??

    Facts:
    - as I mentioned both table's serial column are primary key columns.
    - ALT_RegistryMultikey's key1RegSr is not index.
    - Using Oracle 10g on Windows but seems it also same behavior in 8i Solaris




    Create table sencence:


    CREATE SEQUENCE seq_ALT_Registry_serial;

    CREATE TABLE ALT_Registry
    (
    domainSr INTEGER NOT NULL,
    serial INTEGER NOT NULL ,
    category VARCHAR2(32) NOT NULL,
    kind VARCHAR2(32),
    ...
    PRIMARY KEY (serial)
    );

    CREATE UNIQUE INDEX unq_Registry_cateid ON ALT_Registry (domainSr, parentSr, category, id);
    -- CREATE UNIQUE INDEX unq_Registry_catehash ON ALT_Registry (domainSr, parentSr, category, hashKey);
    CREATE INDEX idx_Registry_domainSr ON ALT_Registry (domainSr);
    CREATE INDEX idx_Registry_cateHashKey ON ALT_Registry (domainSr, category, hashKey);
    CREATE INDEX idx_Registry_name ON ALT_Registry (domainSr, name);

    CREATE OR REPLACE TRIGGER trg_ALT_Registry_serial
    BEFORE INSERT ON ALT_Registry
    FOR EACH ROW WHEN (new.serial IS NULL)
    BEGIN
    SELECT seq_ALT_Registry_serial.nextval INTO :new.serial FROM dual;
    END;
    /




    -- -----------------------------------------------------------------------
    -- ALT_RegistryMultikey
    -- -----------------------------------------------------------------------

    CREATE SEQUENCE seq_ALT_Registr_ultikey_serial;

    CREATE TABLE ALT_RegistryMultikey
    (
    domainSr INTEGER NOT NULL,
    serial INTEGER NOT NULL ,
    category VARCHAR2(32) NOT NULL,
    kind VARCHAR2(32),
    langCd VARCHAR2(3),
    key1 VARCHAR2(64) NOT NULL,
    key1RegSr INTEGER NOT NULL,
    key2 VARCHAR2(64),
    key2RegSr INTEGER,
    ..
    name VARCHAR2(127) NOT NULL,
    code VARCHAR2(64),
    val CLOB NOT NULL,
    hashKey INTEGER,
    description VARCHAR2(255),
    PRIMARY KEY (serial)
    );

    CREATE UNIQUE INDEX unq_cateid ON ALT_RegistryMultikey (domainSr, category, name);
    CREATE INDEX idx_domainSr ON ALT_RegistryMultikey (domainSr);

    CREATE OR REPLACE TRIGGER trg_ALT_Registr_ultikey_serial
    BEFORE INSERT ON ALT_RegistryMultikey
    FOR EACH ROW WHEN (new.serial IS NULL)
    BEGIN
    SELECT seq_ALT_Registr_ultikey_serial.nextval INTO :new.serial FROM dual;
    END;
    /


    Thanks

  2. #2
    Join Date
    Sep 2002
    Location
    UK
    Posts
    5,171
    Provided Answers: 1
    With your second query, the optimiser recognises that your outer join to ALT_Registry is pointless, i.e. the result will be the same whether it performs the outer join or not - and since not joining is quicker, that's what it does.

  3. #3
    Join Date
    Apr 2009
    Posts
    5

    Why not do a UNIQUE SCAN?

    Quote Originally Posted by andrewst
    With your second query, the optimiser recognises that your outer join to ALT_Registry is pointless, i.e. the result will be the same whether it performs the outer join or not - and since not joining is quicker, that's what it does.
    But why the first query does a full search even though the serial is indexed (primary key)?
    Shouldn't it use the index and do a UNIQUE SCAN?

    I have similar querys (of other tables) joinin on primary key column and does a UNIQUE SCAN.

  4. #4
    Join Date
    Nov 2003
    Posts
    2,933
    Provided Answers: 12
    Quote Originally Posted by altenia
    But why the first query does a full search even though the serial is indexed (primary key)?
    Shouldn't it use the index and do a UNIQUE SCAN?
    Depends on the number of rows the optimizer expects to be returned by the query. Doing a FTS is more efficient when you retrieve more than approx. 15% of the total number rows in the table (some sources say the break even is at 10% others claim it to be as high as 20%), or if the the table has only a few rows ( < 1000)

    If your query returns fewer rows, then I would assume your statistics are way off. You should check the "Rows" column from the execution plan and verify that with the actual number of rows.

    Doing an index lookup adds an overhead of 2-3 IOs per row that is retrieved.

    You can check with SQL*Plus autotrace or TKProf if the "bad" plan might actually be more efficient in terms of IOs needed

  5. #5
    Join Date
    Sep 2002
    Location
    UK
    Posts
    5,171
    Provided Answers: 1
    Because the optimiser has deduced that many rows from the outer table will be accessed, and so has chosen to do a full scan of both tables and perform a hash join.

    I don't know your tables, but imagine there are 1000 rows in table ALT_Registry, with about 100 rows fitting on a database block, so the table has 10 blocks. Now suppose your query will want to access 200 of these rows. It is going to be quicker to do 10 reads and get all the rows than it is to perform 200 index unique lookups followed by 200 table access by rowids.

    If your query was:

    SELECT rmk.domainSr, rmk.serial, rmk.category
    *, reg1.name as key1Name*
    FROM ALT_RegistryMultikey rmk
    LEFT JOIN ALT_Registry reg1 ON reg1.serial = rmk.key1RegSr
    WHERE rmk.domainSr = 123;

    then it might be a very different story, since a far smaller number of rows will be accessed in both tables (probably).

  6. #6
    Join Date
    Sep 2002
    Location
    UK
    Posts
    5,171
    Provided Answers: 1
    Quote Originally Posted by shammat
    Doing a FTS is more efficient when you retrieve more than approx. 15% of the total number rows in the table (some sources say the break even is at 10% others claim it to be as high as 20%)...
    It could be more efficient for 1% of the rows in some cases. Take the case of a table with a column c that has 100 distinct values, evenly distributed, and where each block contains ~100 rows. A query using "where c = 123" will return ~1% of the rows, but those rows may be scattered throughout all the blocks of the table.

  7. #7
    Join Date
    Apr 2009
    Posts
    5

    Query takes 60seconds to execute

    Quote Originally Posted by shammat
    Depends on the number of rows the optimizer expects to be returned by the query. Doing a FTS is more efficient when you retrieve more than approx. 15% of the total number rows in the table (some sources say the break even is at 10% others claim it to be as high as 20%), or if the the table has only a few rows ( < 1000)

    If your query returns fewer rows, then I would assume your statistics are way off. You should check the "Rows" column from the execution plan and verify that with the actual number of rows.

    Doing an index lookup adds an overhead of 2-3 IOs per row that is retrieved.

    You can check with SQL*Plus autotrace or TKProf if the "bad" plan might actually be more efficient in terms of IOs needed
    Thanks for your explanation.
    Indeed the ALT_RegistryMultikey has total of 600 records.
    and ALT_Registry has total of 800 records.

    But the reason I started this analysis was because the query was taking 60 seconds!

    Actually, the query I mentined in my original thread is a simplified one, the real query I initially tested has more joins and takes 80 seconds to execute in Oracle 8i/Sun and 10 seconds in Oracle 10i/Windows
    (FYI: same data, same query, same indexes in MySQL takes 2 seconds)

    The actual Select sentence goes like this:

    SELECT
    rmk.domainSr
    , rmk.serial
    , rmk.registBySr
    , rmk.registDt
    , rmk.lastModifBySr
    , rmk.lastModifDt
    , rmk.modifSeq
    , rmk.parentSr
    , rmk.context
    , rmk.category
    , rmk.kind
    , rmk.langCd
    , rmk.key1
    , rmk.key1RegSr
    , rmk.key2
    , rmk.key2RegSr
    , rmk.key3
    , rmk.key3RegSr
    , rmk.key4
    , rmk.key4RegSr
    , rmk.key5
    , rmk.key5RegSr
    , rmk.key6
    , rmk.key6RegSr
    , rmk.key7
    , rmk.key7RegSr
    , rmk.key8
    , rmk.key8RegSr
    , rmk.key9
    , rmk.key9RegSr
    , rmk.name
    , rmk.code
    , rmk.val
    , rmk.hashKey
    , rmk.description
    , rmk.weight
    , rmk.remark
    , rmk.attributes
    , rmk.ref1
    , rmk.ref1Context
    , rmk.ref1ContextRecSr
    , rmk.ref2
    , rmk.ref2Context
    , rmk.ref2ContextRecSr
    , rmk.ref3
    , rmk.ref3Context
    , rmk.ref3ContextRecSr
    , reg1.name as key1Name
    , reg2.name as key2Name
    , reg3.name as key3Name
    , reg4.name as key4Name
    , reg5.name as key5Name
    , reg6.name as key6Name
    , reg7.name as key7Name
    , reg8.name as key8Name
    , reg9.name as key9Name
    FROM ALT_RegistryMultikey rmk
    LEFT JOIN ALT_Registry reg1 ON reg1.serial = rmk.key1RegSr
    LEFT JOIN ALT_Registry reg2 ON reg2.serial = rmk.key2RegSr
    LEFT JOIN ALT_Registry reg3 ON reg3.serial = rmk.key3RegSr
    LEFT JOIN ALT_Registry reg4 ON reg4.serial = rmk.key4RegSr
    LEFT JOIN ALT_Registry reg5 ON reg5.serial = rmk.key5RegSr
    LEFT JOIN ALT_Registry reg6 ON reg6.serial = rmk.key6RegSr
    LEFT JOIN ALT_Registry reg7 ON reg7.serial = rmk.key7RegSr
    LEFT JOIN ALT_Registry reg8 ON reg8.serial = rmk.key8RegSr
    LEFT JOIN ALT_Registry reg9 ON reg9.serial = rmk.key9RegSr

  8. #8
    Join Date
    Sep 2002
    Location
    UK
    Posts
    5,171
    Provided Answers: 1
    Ugh, what a nasty table design

    But anyway - are the optimiser stats up to date on these tables? It really shouldn't be taking that long for Oracle to join an 800 row and a 600 row table, whatever the access method.

  9. #9
    Join Date
    Nov 2003
    Posts
    2,933
    Provided Answers: 12
    Quote Originally Posted by altenia
    But the reason I started this analysis was because the query was taking 60 seconds!
    Sounds like a problem with a high-water mark. Is it possible that the table has been quite big at a certain time, and then a big chunk was deleted?

    http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:492636200346818072

    http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:669044239081

    You might want to try one of the methods mentioned in those posts to lower the HWM (if that really is the problem)

    And I agree with andrewst that this is a very ugly database design.

  10. #10
    Join Date
    Apr 2009
    Posts
    5
    Quote Originally Posted by andrewst
    Ugh, what a nasty table design

    But anyway - are the optimiser stats up to date on these tables? It really shouldn't be taking that long for Oracle to join an 800 row and a 600 row table, whatever the access method.
    Yes hasty design...
    Anyway, how do I know if the optimiset stats are up to date?
    I have other queries left joining the ALT_Registry table on serial and they are fast...

  11. #11
    Join Date
    Sep 2002
    Location
    UK
    Posts
    5,171
    Provided Answers: 1
    To see if the optimizer stats are up to date on a table (e.g. table EMP):
    Code:
    SQL> select num_rows, last_analyzed
      2  from user_tables where table_name='EMP'
      3  /
    
      NUM_ROWS LAST_ANALYZ
    ---------- -----------
            35 16-FEB-2009
    
    1 row selected.
    This shows that the table was last analyzed over 2 months ago, when it had 35 rows. It doesn't matter if the stats are old, as long as they are still fairly accurate.

  12. #12
    Join Date
    Apr 2009
    Posts
    5

    The tables are quite new

    Quote Originally Posted by andrewst
    To see if the optimizer stats are up to date on a table (e.g. table EMP):

    This shows that the table was last analyzed over 2 months ago, when it had 35 rows. It doesn't matter if the stats are old, as long as they are still fairly accurate.
    Thanks andreews and shammat.
    I will try that when I go bat to the production site.
    Anyhow, if the optimizer are not accurate, how do I update the optimizer?

    And about shammats advice, what command is used to rebuild the table for the HWM issue?

    Anyway, the tables are quite new for they were added for a recent module the customer asked and they are quite static.

  13. #13
    Join Date
    Sep 2002
    Location
    UK
    Posts
    5,171
    Provided Answers: 1
    To update the optimiser stats you can use DBMS_STATS package - e.g.
    Code:
    exec dbms_stats.gather_schema_stats('MYSCHEMA')
    Or for a single table:
    Code:
    exec dbms_stats.gather_table_stats('MYSCHEMA','MYTABLE')
    There are lots of options on these that you can read up on, but the default calls as above are usually sufficient.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •