  1. #1
    Join Date
    Jul 2004
    Posts
    3

    Unanswered: IO stalls to 100% busy when accessing a SAN disk with backup of Informix dbspace

    Hi Forum,

    Having tried many things, I keep having a problem with an Emulex 10000 HBA. When backing up our database with Netbackup, we see that the backup is a lot slower through one of the HBAs.

    The symptom is that IO runs fine up to a certain point and then stalls. iostat then shows the device in question at 100% busy while the IO rates and service times stay at 0. This lasts for 30-50 seconds, after which IO picks up again. See the listing below. When I force the IO traffic through the other HBA, there are no such busy periods.

    Could anyone give me a clue as to what's happening?

    Listing of iostat -zxn 5, where the c6 controller is the one having the problem:

    extended device statistics
        r/s    w/s     kr/s   kw/s  wait  actv  wsvc_t  asvc_t  %w   %b  device
        0.0    6.2      0.0   43.8   0.0   0.1     0.0    23.0   0    2  c0t0d0
        0.0    6.2      0.0   43.8   0.0   0.2     0.0    31.7   0    3  c1t0d0
     1056.0    0.0  32649.3    0.0   0.0   1.6     0.0     1.5   1   92  c6t1d6
    extended device statistics
        r/s    w/s     kr/s   kw/s  wait  actv  wsvc_t  asvc_t  %w   %b  device
        0.0    0.2      0.0    1.6   0.0   0.0     0.0     9.5   0    0  c0t0d0
        0.0    0.2      0.0    1.6   0.0   0.0     0.0    19.4   0    0  c1t0d0
     1074.4    0.0  32673.3    0.0   0.0   1.6     0.0     1.5   1   93  c6t1d6
    extended device statistics
        r/s    w/s     kr/s   kw/s  wait  actv  wsvc_t  asvc_t  %w   %b  device
        0.0    0.2      0.0    1.6   0.0   0.0     0.0     9.0   0    0  c0t0d0
        0.0    0.2      0.0    1.6   0.0   0.0     0.0    17.7   0    0  c1t0d0
      124.6    0.0   3781.7    0.0   0.0   1.1     0.0     8.6   0   99  c6t1d6
    extended device statistics
        r/s    w/s     kr/s   kw/s  wait  actv  wsvc_t  asvc_t  %w   %b  device
        0.0    0.0      0.0    0.0   0.0   1.0     0.0     0.0   0  100  c6t1d6
    extended device statistics
        r/s    w/s     kr/s   kw/s  wait  actv  wsvc_t  asvc_t  %w   %b  device
        0.0    1.4      0.0   14.6   0.0   0.0     0.0    10.3   0    1  c0t0d0
        0.0    1.4      0.0   14.6   0.0   0.0     0.0    13.2   0    1  c1t0d0
        0.0    0.0      0.0    0.0   0.0   1.0     0.0     0.0   0  100  c6t1d6
    extended device statistics
        r/s    w/s     kr/s   kw/s  wait  actv  wsvc_t  asvc_t  %w   %b  device
        0.0    1.8      0.0   12.6   0.0   0.0     0.0     8.5   0    1  c0t0d0
        0.0    1.8      0.0   12.6   0.0   0.0     0.0    20.7   0    1  c1t0d0
        0.0    0.0      0.0    0.0   0.0   1.0     0.0     0.0   0  100  c6t1d6
    extended device statistics
        r/s    w/s     kr/s   kw/s  wait  actv  wsvc_t  asvc_t  %w   %b  device
        0.0    3.6      0.0   29.0   0.0   0.0     0.0    10.7   0    1  c0t0d0
        0.0    3.6      0.0   29.0   0.0   0.1     0.0    18.4   0    2  c1t0d0
        0.0    0.2      0.0    0.4   0.0   0.0     0.1    21.7   0    0  c6t1d0
       60.8    0.0   1884.8    0.0   0.0   1.0     0.0    17.0   0  100  c6t1d6

    The 100% busy state lasts for 30-60 seconds, then there is a burst of normal IO traffic at about 30 MB/s for 20 seconds, then 100% busy again, and so on.
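
    To pin down when the stalls start and stop, a small script could flag the samples that match this pattern: a device at 100% busy with no reads or writes. This is only a rough sketch. The function name `find_stalls` and the `busy_threshold` parameter are made up for illustration, and it assumes the 11-column layout of `iostat -xn` shown in the listing.

```python
def find_stalls(lines, busy_threshold=100):
    """Return (sample_index, device) pairs for rows that look stalled.

    A row counts as stalled when %b >= busy_threshold while r/s and w/s
    are both 0.0. sample_index counts the 'extended device statistics'
    banners seen so far, so the first sample is 1.
    """
    stalls = []
    sample = 0
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if fields[:3] == ["extended", "device", "statistics"]:
            sample += 1
            continue
        # Skip the column header and anything that is not a data row.
        if len(fields) != 11 or fields[0] == "r/s":
            continue
        try:
            reads, writes = float(fields[0]), float(fields[1])
            busy = int(fields[9])
        except ValueError:
            continue
        if busy >= busy_threshold and reads == 0.0 and writes == 0.0:
            stalls.append((sample, fields[10]))
    return stalls


# Two samples copied from the listing: one busy but moving, one stalled.
sample_output = """\
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
124.6 0.0 3781.7 0.0 0.0 1.1 0.0 8.6 0 99 c6t1d6
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0 100 c6t1d6
""".splitlines()

for sample, device in find_stalls(sample_output):
    print(f"sample {sample}: {device} is busy with no IO")
```

    Passing `sys.stdin` instead of `sample_output` should let the same function read live `iostat -zxn 5` output through a pipe; with a 5-second interval, the number of consecutive flagged samples times 5 gives a rough stall duration.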

    Our configuration is:

    - Solaris 9, latest patches applied, on Fujitsu 1500 hardware.
    - 2 Emulex 10000 LightPulse cards with the latest driver software, 6.02h
    - Veritas Volume Manager 4.1 with latest service pack MP1
    - Veritas Netbackup 5.1 with latest maintenance pack MP4S01
    - Database: Informix 9.40 FC5XG. Backup goes through Netbackup/onbar scripts. The database is held on raw devices in the SAN, which are accessed through Veritas Volume Manager.

    If anyone can give me a clue, it will be greatly appreciated.

    Thanks, Listman.

  2. #2
    Join Date
    Aug 2003
    Location
    Argentina
    Posts
    780
    Hi,

    Please check this: showrev -p | grep IDR121077
    Maybe it is the bug between Sun and Informix.

    Gustavo.

  3. #3
    Join Date
    Jul 2004
    Posts
    3
    Hi Gustavo,

    I don't have a clue what this patch or bug could be. The showrev command doesn't return the patch ID.

    Please tell me more about it when you can.

    Thank you, Listman
