  1. #1
    Join Date
    Nov 2002
    Posts
    11

    Unanswered: Buffer Wait Ratio

    Greetings,

    I have a question regarding monitoring the buffer wait ratio (BR). When checking this, bufwaits / (pagreads + bufwrits), should we issue an onstat -z every time we want to find out the value, like:
    onstat -z
    sleep 300
    onstat -p
    onstat -z
    sleep 300
    onstat -p

    or would it be OK to perform the reset only once and still get an accurate BR from the computation at any time of the day? Currently, if we reset the profile, we see peaks above 8% over a two-minute interval. If we perform a reset once in the morning and then sample every two minutes, we sometimes see the BR go as high as 38%.
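    One way to sample per-interval BR without ever clearing the counters is to take deltas between consecutive cumulative onstat -p snapshots. A minimal sketch (the counter values here are invented for illustration, and the additive BR formula is assumed):

    ```python
    # Interval buffer-wait ratio from two cumulative onstat -p snapshots,
    # without resetting the profile with onstat -z.
    # Snapshot tuples: (bufwaits, pagreads, bufwrits) cumulative counters.

    def interval_br(prev, cur):
        """BR% over the interval between two cumulative snapshots."""
        d_waits = cur[0] - prev[0]                        # bufwaits in interval
        denom = (cur[1] - prev[1]) + (cur[2] - prev[2])   # pagreads + bufwrits
        return 100.0 * d_waits / denom if denom else 0.0

    snap_t0 = (5_000_000, 20_000_000, 5_000_000)   # taken at time T (made up)
    snap_t1 = (5_050_000, 20_400_000, 5_100_000)   # taken 300 s later (made up)

    print(round(interval_br(snap_t0, snap_t1), 2))
    ```

    Taking deltas gives each interval its own BR while leaving the server's running totals intact, which onstat -z would clear.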

    Are there any good sites we can visit for extensive reading on buffer waits and their impact on performance?

    Currently we have maxed out our LRUs (127) and BUFFERS but are still getting a high buffer wait ratio. Our Informix Dynamic Server Version 7.30.UC8 experiences slow response times during peak periods (when a lot of users are working), during which a process that normally takes a few seconds takes a few minutes.

    Aside from the BR issue (always goes above 7%), the rest of the engine appears to be properly tuned.

  2. #2
    Join Date
    Aug 2002
    Location
    Belgium
    Posts
    534
    What's the value of your LRU_MIN_DIRTY and LRU_MAX_DIRTY?
    Normally you don't need to issue an onstat -z every time you want to check; the server accumulates the counters itself.
    If you do issue an onstat -z before each sample, you get the exact values for just the latest 300 seconds.
    Check the sysptprof table in the sysmaster database to see which tables have the most buffer waits.

    Later versions of IDS have other priority mechanisms for flushing buffers.

    Is it possible that you don't have many disks in your system?

    Check the outputs of 'onstat -FR' to monitor LRU activity during the heavy loads.
    rws

  3. #3
    Join Date
    Nov 2002
    Posts
    11
    LRU_MIN_DIRTY = 1 and LRU_MAX_DIRTY = 2. I looked at the sysmaster:sysptprof table and found no column for bufwaits; can it be derived from the values of the other columns in this table?

    As for disks, we have around 15 disks on our server.

    We did an onstat -z yesterday morning just before running a script that logs the buffer wait ratio for the day, and our highest reading yesterday was 18%. Would that value be considered pretty high for an online system?

    Thanks,
    Ghed

  4. #4
    Join Date
    Aug 2002
    Location
    Belgium
    Posts
    534
    Yes, correct.
    You have to monitor the bufreads and bufwrites to see which table is causing this.
    sysptprof also contains a seqscans field. This could also be a cause of your problem.
    I usually don't tune the BR. You can see immediately in the onstat -p output which values are too high.
    Are your read and write caches OK?
    Did you check the performance guide from Informix?
    I think it might be interesting to run some tests on a version 9 IDS, to see how it behaves with your application.
    rws

  5. #5
    Join Date
    Aug 2002
    Location
    Belgium
    Posts
    534
    There is good news in the upcoming release, though: LRU_MIN_DIRTY and LRU_MAX_DIRTY can be set with percentage granularity from that version on.
    rws

  6. #6
    Join Date
    Nov 2002
    Posts
    11
    The cache appears to be OK. Here is a capture of the onstat -p output:

    Profile
    dskreads pagreads bufreads %cached dskwrits pagwrits bufwrits %cached
    44011919 20823203 3157412442 98.61 628559 1075255 5219640 87.96

    isamtot open start read write rewrite delete commit rollbk
    2885109161 3475556 60706876 2696433811 578165 1124965 168175 125856 41

    gp_read gp_write gp_rewrt gp_del gp_alloc gp_free gp_curs
    0 0 0 0 0 0 0

    ovlock ovuserthread ovbuff usercpu syscpu numckpts flushes
    0 0 0 47600.72 5486.94 36 72

    bufwaits lokwaits lockreqs deadlks dltouts ckpwaits compress seqscans
    5222017 40 120519753 0 0 166 132675 539271

    ixda-RA idx-RA da-RA RA-pgsused lchwaits
    18549573 2614385 4374456 24757977 69199017
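    As a cross-check, the ratios can be recomputed from the counters above. A quick sketch (assuming the standard read/write cache formulas, which reproduce the %cached figures shown, and the additive BR formula):

    ```python
    # Recompute cache ratios and BR from the onstat -p counters in this post.
    dskreads, pagreads, bufreads = 44_011_919, 20_823_203, 3_157_412_442
    dskwrits, pagwrits, bufwrits = 628_559, 1_075_255, 5_219_640
    bufwaits = 5_222_017

    read_cache = 100.0 * (1 - dskreads / bufreads)          # matches the 98.61 shown
    write_cache = 100.0 * (bufwrits - dskwrits) / bufwrits  # matches the 87.96 shown
    br = 100.0 * bufwaits / (pagreads + bufwrits)           # roughly 20%

    print(f"{read_cache:.2f} {write_cache:.2f} {br:.2f}")
    ```

    A BR of around 20% since the last reset is well above the 7% the thread treats as a ceiling, consistent with the complaints below.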

    Aside from the bufwaits being high, we also notice lock waits when response time is slow (although they amount to less than 1 percent of the lock requests).

    This was captured at a time when users were complaining of slow response times.

    Thanks,
    Ghed

  7. #7
    Join Date
    Aug 2002
    Location
    Belgium
    Posts
    534
    You have more than 500000 sequential table scans! If these are scans of tables with lots of rows, I can see why you have a lot of bufwaits.
    Find out which tables are sequentially scanned: query sysptprof and join it with sysptnhdr to see the size of each table.
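    The ranking idea behind that join can be sketched in a few lines: weight each table's seqscans by its size, since a big sequentially scanned table cycles far more pages through the buffer pool. The table names and figures below are invented stand-ins for a sysptprof/sysptnhdr result set:

    ```python
    # Rough ranking of sequential-scan buffer pressure per table.
    # (tabname, seqscans, pages) tuples stand in for a sysptprof/sysptnhdr
    # join; all values here are invented for illustration.
    tables = [
        ("orders",    193_600, 12_000),
        ("customers",  20_000, 90_000),
        ("codes",     300_000,      8),
    ]

    # seqscans * pages is a crude proxy for pages dragged through the pool:
    # many scans of a tiny lookup table matter far less than fewer scans
    # of a large table.
    ranked = sorted(tables, key=lambda t: t[1] * t[2], reverse=True)
    for name, scans, pages in ranked:
        print(f"{name:10s} ~{scans * pages:>14,d} pages scanned")
    ```

    Note how the heavily scanned but tiny "codes" table drops to the bottom, while the two large tables dominate.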
    Your write buffer cache could be improved, but I assume that's because you do a lot of background writes...

    About the lockwaits. I personally don't see any problem if I look at your onstat -p output, but...
    Check what table is causing the lockwaits.
    Check if that table has page or row locking...
    Usually we set the table with the most lock requests to row locking. Especially when the row size is small.
    rws

  8. #8
    Join Date
    Nov 2002
    Posts
    11
    I checked the nrows and seqscans columns; one table has 193,600 seqscans with 1,200,400 nrows, and some have fewer seqscans but higher nrows. Is there a formula to follow for these two columns, or an indicator that would say that reaching a certain level causes bufwaits?

    Thanks for the input, really appreciate it.

  9. #9
    Join Date
    Aug 2002
    Location
    Belgium
    Posts
    534
    No, there isn't. It's just that sequential scans of tables with lots of rows cause buffers to fill, and perhaps buffers that are still needed get thrown out... That can cause bufwaits.
    I think some SQL tuning / adding indexes will solve your problem.
    rws
