Results 1 to 5 of 5
  1. #1
    Join Date
    May 2003
    Location
    Phoenix, US
    Posts
    39

    Unanswered: Performacen issue when using UPPER

    Hi
    Below is the version of the DB2 used
    DB2 code release "SQL08021" with level identifier "03020106".

    A simple SQL statement which is performing a self join
    Table ample and below are the columns in the table
    id integer
    desc varchar(255)
    No. of records 15000
    Running the below query it takes 8-10minutes
    select count(1) from sample a , sample b where a.id!=b.id and upper(a.desc) = upper(b.desc)

    Table statistics reflect the current data and also a re-org is performed.

    Are there any known performance issues with UDB when using upper/lower/translate.

    Regards
    Sateesh
    OCP,IBM UDB,SYBASE DBA
    TCS, BANGALORE, INDIA

  2. #2
    Join Date
    Aug 2001
    Location
    UK
    Posts
    4,650
    Sateesh, Yes - definitely there is a performance overhead when using these functions - in terms of CPU and elapsed time .. Also, the use of these functions make the predicate Stage-2 which cannot make use of an index ...

    Now, coming to your specific case ,I do not know your system configuration .. So, not sure if this is really a problem .. Generally speaking, I'm not too very suprised with the 8-10 mins ..

    I'm sure you realize that you are asking for a cartesian product-like resultset - resulting in 15000 less than 225 million records .. that too with a != (if I make a guess that id is a unique key)

    What is the performance like, when you say a.desc=b.desc ?

    What is the access plan in each one of the cases (a.desc=b.desc and upper(a.desc)=upper(b.desc))
    Visit the new-look IDUG Website , register to gain access to the excellent content.

  3. #3
    Join Date
    May 2003
    Location
    Phoenix, US
    Posts
    39
    Pasted below is the explain plan for SQL's with & without UPPER
    System configuration is 3 processors and 16G memory.

    SQL Statement:

    SELECT a.id, b.id
    FROM sample a, sample b
    WHERE a.id !=b.id AND a.desc =b.desc


    Section Code Page = 1208

    Estimated Cost = 105.697922
    Estimated Cardinality = 13300.000000

    Access Table Name = SAMPLE ID = 7,10
    | #Columns = 1
    | Relation Scan
    | | Prefetch: Eligible
    | Lock Intents
    | | Table: Intent Share
    | | Row : Next Key Share
    | Sargable Predicate(s)
    | | #Predicates = 1
    | | Return Data to Application
    | | | #Columns = 2
    Return Data Completion

    End of section


    SQL Statement:

    SELECT a.id, b.id
    FROM sample a, sample b
    WHERE a.id !=b.id AND upper(a.desc) =upper(b.desc)


    Section Code Page = 1208

    Estimated Cost = 578523.312500
    Estimated Cardinality = 7076132.000000

    Access Table Name = SAMPLE ID = 7,10
    | #Columns = 2
    | Relation Scan
    | | Prefetch: Eligible
    | Lock Intents
    | | Table: Intent Share
    | | Row : Next Key Share
    Nested Loop Join
    | Access Table Name = SAMPLE ID = 7,10
    | | #Columns = 1
    | | Relation Scan
    | | | Prefetch: Eligible
    | | Lock Intents
    | | | Table: Intent Share
    | | | Row : Next Key Share
    | | Sargable Predicate(s)
    | | | #Predicates = 2
    | | | Return Data to Application
    | | | | #Columns = 2
    Return Data Completion

    End of section
    OCP,IBM UDB,SYBASE DBA
    TCS, BANGALORE, INDIA

  4. #4
    Join Date
    Dec 2005
    Posts
    273
    consider using

    SELECT a.id, b.id
    FROM sample a, sample b
    WHERE a.id < b.id AND upper(a.desc) =upper(b.desc)

    instead of
    ... WHERE a.id !=b.id ...

    all appropriate rows will be found nevertheless, but only half of the scan-work must be done.


    might be even enforcing a change of the accesspath from nested-loop-join to merge-scan-join improves performance.

    maybe this query does better perform:

    SELECT a.id, b.id
    FROM sample a INNER JOIN sample b
    ON upper(a.desc) = upper(b.desc)
    WHERE a.id < b.id

    or:
    SELECT a.id, b.id
    FROM sample a INNER JOIN sample b
    ON upper(a.desc) = upper(b.desc)
    WHERE a.id + 0 < b.id + 0

  5. #5
    Join Date
    Feb 2006
    Location
    Utrecht, Netherlands
    Posts
    16
    Quote Originally Posted by msateeshkumar
    Hi
    Below is the version of the DB2 used
    DB2 code release "SQL08021" with level identifier "03020106".

    A simple SQL statement which is performing a self join
    Table ample and below are the columns in the table
    id integer
    desc varchar(255)
    No. of records 15000
    Running the below query it takes 8-10minutes
    select count(1) from sample a , sample b where a.id!=b.id and upper(a.desc) = upper(b.desc)

    Table statistics reflect the current data and also a re-org is performed.

    Are there any known performance issues with UDB when using upper/lower/translate.

    Regards
    Sateesh
    How's this?
    export to sample.ixf of ixf select id, desc from sample;
    import from sample.ixf of ixf create into sampletemp;
    update sampletemp set desc = upper(desc);
    create index ixsample on sampletemp (desc asc);
    select count(1) from sampletemp a , sampletemp b where a.id!=b.id and a.desc = b.desc;

    Something like that... you might want to drop sampletemp afterwards.
    I guarantee you that this will be a whole lot faster.
    There are other ways: you might add an extra column, and perform the same trick on that. This eliminates the need for an extra table, and is easier to perform from an application.

    Another way if achieving better performance is: look at the columns first. How many of them are uppercase already? You could perform a count on them, using the index, and a count on the rest, not using the index (because of the upper() function), and add the results. This could work if most of the rows have an upper case name. If everything is mixed case, then this will not work.

    cheers,
    Rob.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •