  1. #1
    Join Date
    Apr 2010
    Posts
    3

    Unanswered: Scan performance

    Hi,

    I need some help to make a performance test for a table scan.


    I have a db2 database where I have created a table called HOTEL.

    CREATE TABLE hotel (hid INT NOT NULL, name VARCHAR(20) NOT NULL, rooms INT, stars INT, address VARCHAR(100), zipcode VARCHAR(10))

    Then I insert 1,000,000 records into the table.
    Then I perform a table scan: SELECT hid, name FROM hotel
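
    For illustration, one way to generate the million rows in DB2 is a recursive common table expression (a sketch; the column values are arbitrary, and DB2 emits warning SQL0347W for the recursion but still runs it):

    INSERT INTO hotel (hid, name, rooms, stars, address, zipcode)
    WITH gen(n) AS (
        VALUES (1)
        UNION ALL
        SELECT n + 1 FROM gen WHERE n < 1000000
    )
    SELECT n,
           'Hotel ' || RTRIM(CHAR(n)),    -- name, at most 13 characters
           MOD(n, 500) + 1,               -- rooms
           MOD(n, 5) + 1,                 -- stars
           'Street ' || RTRIM(CHAR(n)),   -- address
           RTRIM(CHAR(MOD(n, 99999)))     -- zipcode, at most 5 digits
    FROM gen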

    I want to speed up the scan (with a cold buffer), so when I look at the monitor (db2top -d <database>) I should see some activity for P_Reads.
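
    Besides db2top, the bufferpool snapshot shows the read counters directly (with <database> as a placeholder):

    db2 update monitor switches using bufferpool on
    db2 get snapshot for bufferpools on <database>

    Comparing "Buffer pool data physical reads" with "Buffer pool data logical reads" in the output tells me whether the scan really hit the disk.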

    Now, I am the only one on the database, so I have changed the ISOLATION LEVEL to UR (uncommitted read), and it gives me some speedup.
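
    For reference, UR can also be requested per statement instead of per session:

    SELECT hid, name FROM hotel WITH UR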

    My question is: could it be a good idea to create a two-column index on the table?
    CREATE INDEX I1 ON Hotel (hid, name)

    I have tried this, but when I run the query it causes a deadlock in the database.
    So I think I am doing something wrong.

    Can you tell me how to do it, and perhaps suggest other ways to optimize the table scan?

    Thanks in advance
    Jesper

  2. #2
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    In real life you don't select a million hotel names. If you have an actual query that needs optimization, let's look at it.

    You mention some "deadlock", yet you also say that you are the only one connected. Could you please elaborate?

    In general, table scan performance can only be improved by increasing the bufferpool or by tuning prefetch performance.
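    For example (a sketch, assuming the default bufferpool and tablespace names; the bufferpool size is in pages):

    ALTER BUFFERPOOL ibmdefaultbp SIZE 100000
    ALTER TABLESPACE userspace1 PREFETCHSIZE AUTOMATIC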
    ---
    "It does not work" is not a valid problem statement.

  3. #3
    Join Date
    Apr 2010
    Posts
    3

    Re:-)

    Hi N_i,

    This has nothing to do with real life!
    I got an assignment where I have to scan 1 million tuples as quickly as I can.
    It doesn't matter what data we are talking about; I just gave an example with a table named hotel.

    The question is: scan 1 million tuples as fast as you can with a cold buffer. That is, you have to be sure you make physical reads and not buffer reads.
    So in this exercise I think it would not help to change the size of the bufferpool, or would it?

    For this exercise I have access to Amazon AWS, where I have an instance running DB2. So when I run the test there is nobody to interfere with the result.

    I figured it out. I should have made the index unique:
    CREATE UNIQUE INDEX c ON hotel (hid ASC, name ASC)
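
    I assume the optimizer also needs fresh statistics before it will consider the index, and that the access plan can be checked with db2expln (a sketch; the schema and database names are placeholders):

    RUNSTATS ON TABLE myschema.hotel AND INDEXES ALL
    db2expln -d mydb -statement "SELECT hid, name FROM hotel" -terminal

    If the plan shows index-only access, the reads should go to the smaller index object instead of the whole table.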

    But it doesn't seem to yield the speedup I expected. Any suggestions?

    / Jesper

  4. #4
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    Quote Originally Posted by jesperd View Post
    The question is: scan 1 million tuples as fast as you can with a cold buffer.
    Sorry, your requirement still does not make any sense to me. Why do you need this? What do you mean by "scan"? How do you measure performance?

    Quote Originally Posted by jesperd View Post
    I think it would not help to change the size of the bufferpool, or would it?
    If the bufferpool is too small to hold the entire object, DB2 will need to select which pages to replace, which takes time, however little.

    Quote Originally Posted by jesperd View Post
    For this exercise I have access to Amazon AWS, where I have an instance running DB2
    This makes even less sense. DB2 cannot scan anything faster than it can read data from disk, and in this case you are at the mercy of some virtual "disk" with unknown parallelism. Disk performance is completely outside DB2's control, so how do you want to tune something you cannot control? This is exactly why you normally minimize the amount of data you read.

  5. #5
    Join Date
    Apr 2010
    Posts
    3
    Quote Originally Posted by n_i View Post
    This makes even less sense. DB2 cannot scan anything faster than it can read data from disk, and in this case you are at the mercy of some virtual "disk" with unknown parallelism. Disk performance is completely outside DB2's control, so how do you want to tune something you cannot control? This is exactly why you normally minimize the amount of data you read.
    Okay I see your point.

    I want to get the data from 2 columns in the table, to find out how much overhead there is when reading a table in DB2.
    To measure this I use a script that makes a connection to the database, fetches the data from these 2 columns, and returns the time consumed. The time is measured only while the query executes.
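
    A minimal sketch of such a script (MYDB is a placeholder, and the db2 command itself adds a small overhead on top of the pure execution time):

    db2 connect to MYDB > /dev/null
    start=$(date +%s.%N)
    db2 "SELECT hid, name FROM hotel WITH UR" > /dev/null
    end=$(date +%s.%N)
    echo "query time: $(echo "$end - $start" | bc) seconds"
    db2 connect reset > /dev/null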

    As far as I know, 1 million tuples with an integer (4 bytes) and a varchar(20) (about 20 bytes) use approximately 24 MB of disk space. My disk system can read about 237,000 KB/s sequentially, so reading 24 MB should take roughly 24 / 237 ≈ 0.1 seconds.

    When I make the test I have to be sure the cache is empty, so when I execute the query I have to do it on a freshly started database.
    I do this with these commands:

    db2stop
    exec db2nkill 0
    db2start

    The db2stop command does not clear the bufferpool entirely, so I have to resort to the db2nkill hack.
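    I assume that if the tablespace uses file system caching, the OS page cache would also need flushing; on Linux (as root) that would be:

    sync
    echo 3 > /proc/sys/vm/drop_caches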
    I am now ready to make the experiment. The results I have so far are:

    (default settings)
    With a COLD BUFFER: 0.18241258 sec.
    With a WARM BUFFER: 0.0639812 sec.

    With uncommitted read (UR)
    With a COLD BUFFER: 0.12914299 sec.
    With a WARM BUFFER: 0.06244802 sec.

    With UR I have almost reached my goal of what the disk system can perform.


    I hope this is clearer to you :-)

    / Jesper
