Page 1 of 2 12 LastLast
Results 1 to 15 of 16
  1. #1
    Join Date
    Aug 2011
    Posts
    3

    Unanswered: Performance Issue: 200 columns versus Single column

    I have a table say Order. I want to search for orders based on the states it goes through. Order can go through 200 different states. I want to search orders which have reached some states say S1, S2,S3 and not reached some other states say S4,S7,S10.

    I have two approaches to implement this but I am not sure which one would be best considering performance.

    Approach 1: Add 200 columns in Order table for each state say Column1, Column2 etc and columns would be boolean i.e. value would be Y or N.

    Then while querying the table I can use Column1=Y and Column2=Y and Column3=Y and Column4=N and Column7=N and Column10=N

    Approach 2: Add only 1 column in Order table say State_Summary with size as 100 characters. I will map each state to a bit mask position in binary. For example S1 as 2^0=1 i.e 01, S2 as 2^1=2 i.e 100, S3 as 2^2=4 i.e 1, S4=2^3=8 and like that. Then whenever order reaches a state I will turn on its corresponding bit mask position. For example if order reached S1,S4, S7 I will perform OR on corresponding bit mask positions. For example 01 OR 100 OR 10000000 which is 10000101. While storing value in State_Summary I will have leading zero's added before 10000101 like ....0000000000000000010000101 to store it as 100 char value.

    While querying the table I can use like search on State_Summary column.

    Please suggest which approach would be better ? What would be the advantages and disadvantages of the two approaches?

    Thanks.

  2. #2
    Join Date
    Jan 2004
    Location
    Croatia, Europe
    Posts
    4,094
    Provided Answers: 4
    Approach 3: create another table: suppose that these tables already exist:
    Code:
    -- State ID and its name: 
    -- 1 - order opened
    -- 2 - order signed by person A
    -- 3 - order sent to person B
    -- ...
    -- 100 - finish
    create table state
      (id         number primary key,
       state_name varchar2(50)
      );
      
    -- Who placed the order, when, ...
      create table order
      (id          number primary key,
       order_date  date,
       customer_id number constraint fk_or_cust references customer (id),
       ...
      );
    Now, create another table:
    Code:
    -- This is my suggestion: every new state is written into a separate table.
    -- Foreign keys tell you which values go into which columns  
    create table order_state
      (id         number,
       order_id   number  constraint fk_os_order references order (id),
       state_id   number  constraint fk_os_state references state (id),
       state_date date
      );
    From my point of view, this approach seems to be closest to the 3rd normal form and, hopefully, most efficient.

  3. #3
    Join Date
    Apr 2008
    Location
    Iasi, Romania
    Posts
    561
    Provided Answers: 2
    You may try a performance view:
    Approach 1: a higher row size - less rows per page - more I/O. Also, the SELECT will certainly do a table scan
    Approach 2: the LIKE condition will certainly do a table scan
    Approach 3: you will have joins BUT with the proper indexes, you shouldn't have table scans

    Now, depends on the number of rows in the Order table if you will prefer joins or table scans.
    Florin Aparaschivei
    DB2 9.7, 10.5 on Windows
    Iasi, Romania

  4. #4
    Join Date
    Aug 2011
    Posts
    3
    My first design was similar to suggested by Littlefoot as Approach 3. But after implementation I faced query timeout issue. Although I had added Index but in query I was joining Order and Order_State tables and on that was using EXISTS in where clause (for orders that reached some states and not reached some other states). So the cost of qeury was coming out too much.

    @aflorin27: Can you please explain more on how with Approach 1 SELECT will do a full table scan ?

  5. #5
    Join Date
    Apr 2008
    Location
    Iasi, Romania
    Posts
    561
    Provided Answers: 2
    Your query WHERE clause will be something like
    Column1=Y and Column2=Y and Column3=Y and Column4=N and Column7=N and Column10=N

    it is impossible to have a B-Tree index to cover all the possibilities. No index = table scan. Although you may try to use a bitmap index - you should test this possibility.
    Florin Aparaschivei
    DB2 9.7, 10.5 on Windows
    Iasi, Romania

  6. #6
    Join Date
    Jun 2003
    Location
    West Palm Beach, FL
    Posts
    2,713

    Wink

    I like "Approach 2: Add only 1 column in Order table say State_Summary with size as 100 characters.". Except instead of size 100, just NUMBER type.

    Then I could use the BIN_TO_NUM() and BITAND() functions to do the comparisons.
    The person who says it can't be done should not interrupt the person doing it. -- Chinese proverb

  7. #7
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    I thought that a little modification for order_state table in Approach 3 might be better, like
    Code:
    CREATE TABLE order_state
    ( order_id   number  constraint fk_os_order references order (id)
    , state_id   number  constraint fk_os_state references state (id)
    , state_date date
    , PRIMARY KEY (order_id , state_id)
    );
    Quote Originally Posted by db2queen View Post
    My first design was similar to suggested by Littlefoot as Approach 3. But after implementation I faced query timeout issue. Although I had added Index but in query I was joining Order and Order_State tables and on that was using EXISTS in where clause (for orders that reached some states and not reached some other states). So the cost of qeury was coming out too much.
    How about this query?

    Example 1:
    Code:
    SELECT r.*
     FROM  order r
     WHERE EXISTS
           (SELECT 0
             FROM  order_state s
             WHERE s.order_id =  r.id
               AND s.state_id IN ( 1 , 2 , 3 , 4 , 7 ,10)
             HAVING
                   COUNT( CASE WHEN s.state_id IN ( 1 , 2 , 3) THEN 0 END ) = 3
               AND COUNT( CASE WHEN s.state_id IN ( 4 , 7 ,10) THEN 0 END ) = 0
           )
    ;
    Last edited by tonkuma; 01-11-12 at 22:09. Reason: Remove GROUP BY clause in the sample code.

  8. #8
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Example 2 may be a little better than Example 1,
    because Two COUNT functions vs One SUM function.

    Example 2:
    Code:
    SELECT r.*
     FROM  order r
     WHERE EXISTS
           (SELECT 0
             FROM  order_state s
             WHERE s.order_id =  r.id
               AND s.state_id IN ( 1 , 2 , 3 , 4 , 7 ,10)
             HAVING
                   SUM( CASE
                        WHEN s.state_id IN ( 1 , 2 , 3) THEN
                             +1
                        WHEN s.state_id IN ( 4 , 7 ,10) THEN
                             -1
                        END
                      ) = 3
           )
    ;

  9. #9
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Another query idea for my modified Approach 3 design.

    Example 3:
    Code:
    SELECT r.*
     FROM  order r
     WHERE EXISTS(
              SELECT 0
               FROM  order_state s
               WHERE s.order_id =  r.id
                 AND s.state_id IN ( 1 , 2 , 3 , 4 , 7 ,10)
               HAVING
                     LISTAGG( TO_CHAR(s.state_id) , ':' )
                        WITHIN GROUP(ORDER BY s.state_id)
                     = '1:2:3'
           )
    ;

  10. #10
    Join Date
    Jun 2003
    Location
    West Palm Beach, FL
    Posts
    2,713
    Quote Originally Posted by LKBrwn_DBA View Post
    I like "Approach 2: Add only 1 column in Order table say State_Summary with size as 100 characters.". Except instead of size 100, just NUMBER type.

    Then I could use the BIN_TO_NUM() and BITAND() functions to do the comparisons.
    Following up on this, you could also use TO_NUMBER + BITAND:
    (If bit is number is from 1 to nn then 8, 16, 32... is "power(2, bit-1)".

    Code:
    SQL> select  sign(bitand(to_number('99','XX'),8)) bit4
      2    from  dual
      3  /
    
          BIT4
    ----------
             1
    
    SQL> select  sign(bitand(to_number('99','XX'),16)) bit5
      2    from  dual
      3  /
    
          BIT5
    ----------
             1
    
    SQL> select  sign(bitand(to_number('99','XX'),32)) bit6
      2    from  dual
      3  /
    
          BIT6
    ----------
             0
    The person who says it can't be done should not interrupt the person doing it. -- Chinese proverb

  11. #11
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Some questions for Approach 2.

    (1) Which is better for State_Summary column?
    Order can go through 200 different states.
    (1-1) 200 characters

    (1-2) Two numbers(Seven numbers, if considered ease of handling in PL/SQL.)
    Notes on the BITAND Function
    ■ The current implementation of BITAND defines n = 128.
    ■ PL/SQL supports an overload of BITAND for which the types of the inputs and of
    the result are all BINARY_INTEGER and for which n = 32.

    (2) How to turn on S50 for an Order, when the order already reached S1,S2 and S3?


    (3) How to search orders which have reached some states say S1, S2,S3 and not reached some other states say S4,S7,S10?

    __________________
    I don't want to interrupt the person doing it.
    I want to learn how the person doing it.
    Last edited by tonkuma; 01-13-12 at 22:01.

  12. #12
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    I tried to find an answer to my questions by myself.

    I thought that both of my solutions have the performance isuue(table scan) in serching, like pointed in
    Quote Originally Posted by aflorin27 View Post
    You may try a performance view:
    ...
    Approach 2: the LIKE condition will certainly do a table scan
    ...

    Now, depends on the number of rows in the Order table if you will prefer joins or table scans.
    Are there any better idea?


    ------------------------------------------------------------------------------------------------------
    (1-1) 200 characters

    (2) How to turn on S50 for an Order, when the order already reached S1,S2 and S3?
    Code:
    UPDATE order r
       SET state_summary
         = SUBSTR(state_summary , 1 , 200 - 50)
           || '1' ||
           SUBSTR(state_summary , 202 - 50 , 50 - 1)
     WHERE r.id = xxx;
    (3) How to search orders which have reached some states say S1, S2,S3 and not reached some other states say S4,S7,S10?
    Code:
    SELECT id , order_date
     FROM  order
     WHERE state_summary LIKE '%0__0__0111'
                   /* or, 190 leading underscore("_") */
                   /* like... '______________________________0__0__0111' */
    ;

    ------------------------------------------------------------------------------------------------------
    (1-2) Two numbers(Seven numbers, if considered ease of handling in PL/SQL.)

    (2) How to turn on S50 for an Order, when the order already reached S1,S2 and S3?
    Code:
    UPDATE order r
       SET state_summary
         = state_summary + POWER(2 , 50)
     WHERE r.id = xxx
       AND BITAND( state_summary , POWER(2, 50) )
           = 0
    ;
    (3) How to search orders which have reached some states say S1, S2,S3 and not reached some other states say S4,S7,S10?
    Code:
    SELECT id , order_date
     FROM  order r
     WHERE BITAND(  state_summary
                 ,  POWER(2 , 0) + POWER(2 , 1) + POWER(2 , 2)
                  + POWER(2 , 3) + POWER(2 , 6) + POWER(2 , 9)
                 )
                 =  POWER(2 , 0) + POWER(2 , 1) + POWER(2 , 2)
    ;
    Last edited by tonkuma; 01-14-12 at 21:36. Reason: Add alternate pattern of LIKE in sample code of (1-1),(3). Reformat sample code of (1-1),(2)

  13. #13
    Join Date
    Jun 2003
    Location
    West Palm Beach, FL
    Posts
    2,713

    Cool

    Quote Originally Posted by tonkuma View Post
    Some questions for Approach 2.

    (1) Which is better for State_Summary column?

    (1-1) 200 characters

    (1-2) Two numbers(Seven numbers, if considered ease of handling in PL/SQL.)

    (2) How to turn on S50 for an Order, when the order already reached S1,S2 and S3?

    (3) How to search orders which have reached some states say S1, S2,S3 and not reached some other states say S4,S7,S10?
    (1-2) You can use RAW datatype and UTL_RAW package.

    (2) Same as above, use UTL_RAW bit functions.

    (3) Using binary arithmetic.

    The person who says it can't be done should not interrupt the person doing it. -- Chinese proverb

  14. #14
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    (2) Same as above, use UTL_RAW bit functions.
    I guessed that you suggested the way to turn on S50 was something like...
    SET State_Summary = UTL_RAW.BIT_OR(State_Summary , <hex value bit 50 is 1 and others are all 0>)
    in an UPDATE query.
    If so, any idea to make the hex value?

    (3) Using binary arithmetic.
    I couldn't make the WHERE cluase for a query
    "to search orders which have reached some states say S1, S2,S3 and not reached some other states say S4,S7,S10"
    Would you show me a concrete example of the WHERE clause?
    Code:
    SELECT id , order_date
     FROM  order r
     WHERE .....

  15. #15
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    If used RAW datatype, I thought that State_Summary column should be RAW(25) to include 200 states.

    Then, I tried to find answers.

    Question:
    (2) How to turn on S50 for an Order, when the order already reached S1,S2 and S3?
    Answer:
    (2) Same as above, use UTL_RAW bit functions.
    50 / 4 = 12 ... 2
    So, right most values to be ORed might be
    02000000000000
    Padded left with '0' to the lenght 25, an UPDATE statement might be like...
    Code:
    UPDATE order
       SET State_Summary
           = UTL_RAW.BIT_OR(
                State_Summary
              , HEXTORAW('00000000000000000000000000000000000002000000000000')
             )
     WHERE ...

    Question:
    (3) How to search orders which have reached some states say S1, S2,S3 and not reached some other states say S4,S7,S10?
    Answer:
    (3) Using binary arithmetic.
    Thinking similar ways as (2), I made a query.
    But, I thought it might have performance issue(table scan).
    Code:
    SELECT id , order_date
     FROM  order r
     WHERE UTL_RAW.BIT_AND(
              State_Summary
            , HEXTORAW('0000000000000000000000000000000000000000000000024F')
           )= HEXTORAW('00000000000000000000000000000000000000000000000007')
    I'm a begginer of Oracle. Are there any other idea?

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •