  1. #1
    Join Date
    May 2011
    Posts
    12

    Unanswered: ADM10500E - db.auto_storage issue

    Hi all!

    I get the following error in my logs:

    Code:
    ADM10500E  Health indicator "Database Automatic Storage Utilization" ("db.auto_storage_util") breached the "upper" alarm threshold of "90 %" with value "99 %" on "database" "db2inst1.C60     ".  Calculation: "((db.auto_storage_used/db.auto_storage_total)*100);" = "((13242013745152 / 13409961639936) * 100)" = "99 %".  History (Timestamp, Value, Formula): "()"
    the associated recommendations are all about freeing up space or adding storage, but there is already plenty of space available on the disk:

    Code:
    db2inst1@XXXXX:/usr/local/dbs/db2/db2inst1/db2inst1/NODE0000/C60$ df -h .
    Filesystem            Size  Used Avail Use% Mounted on
    storage1.stg:/vol/naf001/dbs/db2/XXXXX
                           13T   13T  157G  99% /usr/local/dbs/db
    I should mention that DB2 itself is installed on a different storage volume (which has about 60G left).

    Any help would be greatly appreciated!
    Thanks

    PS: DB2 version: DB2/LINUXX8664 9.7.2
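
    As a sanity check, the formula quoted in the ADM10500E message can be recomputed from the two byte counts it reports. The sketch below is only an illustration: the helper names storage_util_pct and storage_free_gib are made up here and are not DB2 commands. It suggests the health indicator and df agree, with roughly 156 GiB free out of the total, which is in line with the 157G shown by df.

```shell
#!/bin/sh
# Recompute the health-indicator formula from the ADM10500E message:
#   (db.auto_storage_used / db.auto_storage_total) * 100
# Both helper names are hypothetical, chosen for this example only.

# storage_util_pct USED_BYTES TOTAL_BYTES -> utilization in percent
storage_util_pct() {
    awk -v u="$1" -v t="$2" 'BEGIN { printf "%.2f\n", (u / t) * 100 }'
}

# storage_free_gib USED_BYTES TOTAL_BYTES -> remaining space in GiB
storage_free_gib() {
    awk -v u="$1" -v t="$2" 'BEGIN { printf "%.1f\n", (t - u) / (1024 * 1024 * 1024) }'
}

# Values copied from the log message above.
storage_util_pct 13242013745152 13409961639936   # prints 98.75
storage_free_gib 13242013745152 13409961639936   # prints 156.4
```

    So the 99 % in the alert is the same 98.75 % rounded up, and the alarm threshold of 90 % is exceeded even though 156 GiB sounds like a lot in absolute terms.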

  2. #2
    Join Date
    May 2003
    Location
    USA
    Posts
    5,737
    As long as you have auto-resize on all your tablespaces, you are OK. IMO Health Monitor is a virus. It is deprecated in 9.7:
    IBM DB2 9.7 for Linux, UNIX and Windows Information Center

    A lot of people shut off the Health monitor in the dbm config.
    M. A. Feldman
    IBM Certified DBA on DB2 for Linux, UNIX, and Windows
    IBM Certified DBA on DB2 for z/OS and OS/390

  3. #3
    Join Date
    May 2011
    Posts
    12
    Hi Marcus_A,

    Thanks a lot for the help, I was not aware of it being deprecated. I therefore disabled it as well. I also ensured that my tablespaces were on AUTORESIZE.

    However, what prompted me to check these DB2 log files in the first place is that my load operations seem to stop somewhere while running, in a manner that is not consistent. A bit as if it were running out of memory. In fact, I don't think it fails per se, I think it just becomes so slow that it looks like it freezes. However, "free" indicates plenty of memory available, and the very same loads used to work in the past...

    Any suggestions? Are there other logs I can check? MySQL, for instance, has a nice "show processlist" that shows what queries are running; an equivalent in DB2 would at least confirm that my load query is indeed still running.

    Thanks!!

  4. #4
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    Quote Originally Posted by anthony82 View Post

    Any suggestions? Are there other logs I can check? MySQL, for instance, has a nice "show processlist" that shows what queries are running; an equivalent in DB2 would at least confirm that my load query is indeed still running.
    LOAD operations are logged in db2diag.log

    You can also use the db2pd tool.

  5. #5
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,369
    Quote Originally Posted by anthony82 View Post
    However, what prompted me to check these DB2 log files in the first place is that my load operations seem to stop somewhere while running, in a manner that is not consistent. A bit as if it were running out of memory. In fact, I don't think it fails per se, I think it just becomes so slow that it looks like it freezes. However, "free" indicates plenty of memory available, and the very same loads used to work in the past...

    Try increasing the msgmnb kernel parameter if the load appears to be hanging. See the following technote for more info:
    https://www-304.ibm.com/support/docv...id=swg21438228

  6. #6
    Join Date
    May 2011
    Posts
    12
    wow, thanks a lot for the responses! it gives me some really good leads to investigate. i will look into msgmnb, db2diag.log and db2pd and will keep you guys posted.

  7. #7
    Join Date
    May 2011
    Posts
    12
    Hi again,

    So I checked my msgmnb and it's already pretty high (65K) so i don't think the problem comes from there. However, by looking closely at db2diag.log, i found the following error:

    Code:
    2011-05-12-10.14.41.762385-240 I244994013E467      LEVEL: Warning
    PID     : 13903                TID  : 139979321764176PROC : db2sysc
    INSTANCE: db2inst1             NODE : 000          DB   : C60
    APPHDL  : 0-2214               APPID: *LOCAL.db2inst1.110512140515
    AUTHID  : DB2INST1
    EDUID   : 20                   EDUNAME: db2agent (C60)
    FUNCTION: DB2 UDB, database utilities, sqluvtld_route_in, probe:841
    DATA #1 : <preformatted>
    Starting LOAD operation (S) (1) (I).
    
    2011-05-12-10.14.42.136989-240 I244994481E548      LEVEL: Warning
    PID     : 13903                TID  : 139979183352144PROC : db2sysc
    INSTANCE: db2inst1             NODE : 000          DB   : C60
    APPHDL  : 0-2214               APPID: *LOCAL.db2inst1.110512140515
    AUTHID  : DB2INST1
    EDUID   : 306                  EDUNAME: db2lfrm0
    FUNCTION: DB2 UDB, database utilities, sqlulPrintPhaseMsg, probe:314
    DATA #1 : String, 120 bytes
    LOADID: 20.2011-05-12-10.14.41.762370.0 (2;14) 
    
    Starting LOAD phase at 05/12/2011 10:14:42.136817. Table C60HSAP .ditag
    
    2011-05-12-10.14.46.655907-240 I244995030E336      LEVEL: Error
    PID     : 13918                TID  : 1092196688   PROC : db2acd
    INSTANCE: db2inst1             NODE : 000
    FUNCTION: DB2 UDB, oper system services, sqloPdbBindSocket, probe:50
    MESSAGE : Unable to bind socket 6 to UNIX path 
              /home/dasusr1/das/tmp/file3lYtZE13918
    
    2011-05-12-10.14.46.656017-240 E244995367E427      LEVEL: Error (OS)
    PID     : 13918                TID  : 1092196688   PROC : db2acd
    INSTANCE: db2inst1             NODE : 000
    FUNCTION: DB2 UDB, oper system services, sqloPdbBindSocket, probe:20
    MESSAGE : ZRC=0x860F000A=-2045837302=SQLO_FNEX "File not found."
              DIA8411C A file "" could not be found.
    CALLED  : OS, -, bind                             OSERR: ENOENT (2)
    
    2011-05-12-10.14.46.656256-240 I244995795E1075     LEVEL: Error
    PID     : 13918                TID  : 1092196688   PROC : db2acd
    INSTANCE: db2inst1             NODE : 000
    FUNCTION: DB2 Tools, DB2 administration server, DasCommNamedPipeAdapter::connect1, probe:30
    DATA #1 : signed integer, 4 bytes
    -2045837302
    CALLSTCK: 
      [0] 0x00007F0CCFA2A33E pdOSSeLoggingCallback + 0x100
      [1] 0x00007F0CCF0E1A42 /usr/local/home/db2/db2inst1/sqllib/lib64/libdb2osse.so.1 + 0x1C4A42
      [2] 0x00007F0CCF0E325E ossLog + 0xA6
      [3] 0x00007F0CC75CBC15 _ZN23DasCommNamedPipeAdapter7connectEiP5sqlca + 0x405
      [4] 0x00007F0CC75BB55B _ZN20DasClientCommManager10dasRequestEP16DasCommInterfaceP15dasRequestInputP13requestOutput + 0x9D
      [5] 0x00007F0CC75BB1DF _ZN14DasCommAdapter10dasRequestEP15dasRequestInputP13requestOutput + 0x11B
      [6] 0x00007F0CC75B89D2 _ZN6DasAPI3runEv + 0x1B6
      [7] 0x00007F0CC923FC76 _ZN13hiAlertCfgMgr21buildNotificationListEv + 0xFEC
      [8] 0x00007F0CC923EAA9 _ZN13hiAlertCfgMgr7refreshEb + 0x3C5
      [9] 0x00007F0CC922D669 _ZN15hiDataCollector17updateRefreshSetsEv + 0x14D
    i'm loading a bunch of tables and this one, "ditag", is the one on which the loading script hangs. I suppose the reason is detailed in the above log, but it's a bit cryptic to me...! a quick google search did not yield an obvious answer. would anyone have an idea where to start looking?

    Thanks!

  8. #8
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,369
    I don't think the errors from db2acd are related to the load. Do you see any additional message for this particular load operation -> LOADID: 20.2011-05-12-10.14.41.762370.0 ?
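
    One simple way to look for further records belonging to that load is to search db2diag.log for the LOADID string. The following is a minimal sketch using plain grep; the function names are hypothetical and the log path in the usage note is an assumption (it depends on the instance's diagnostic path setting).

```shell
#!/bin/sh
# Hypothetical helpers built on grep only; they do not call any DB2 tool.

# count_loadid_records FILE LOADID
# Print how many lines of FILE mention the given LOADID.
count_loadid_records() {
    grep -c -F -- "LOADID: $2" "$1"
}

# show_loadid_records FILE LOADID [CONTEXT_LINES]
# Print each matching line with surrounding context (default 8 lines),
# so the header of the db2diag record (timestamp, LEVEL, FUNCTION) is visible.
show_loadid_records() {
    grep -n -F -B "${3:-8}" -A "${3:-8}" -- "LOADID: $2" "$1"
}
```

    For example, assuming the log lives under the instance owner's sqllib/db2dump directory: count_loadid_records ~/sqllib/db2dump/db2diag.log 20.2011-05-12-10.14.41.762370.0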

    msgmnb = 65K may not be enough on some systems / for some load operations. I've seen some load operations failing / hanging even with 256K. While the load is "hanging", execute ipcs -q | grep <instance name> and then sort by column used-bytes in reverse. Here's an example of what I saw with msgmnb = 131072:

    $ more ipcs.sort.by.4.rn.BK


    ------ Message Queues --------
    key msqid owner perms used-bytes messages
    0x00000000 346751138 xxxxxxxx 701 131067 2733
    0x00000000 346751138 xxxxxxxx 701 131067 2733
    ....

    131067 is very close to msgmnb = 131072. In this case, the load was hanging and / or generating errors as mentioned in the technote.
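
    The check described above (compare used-bytes per queue against msgmnb) can be scripted roughly as follows. This is a sketch under assumptions: queues_near_limit is a made-up helper name, the 90 % default threshold is arbitrary, and it expects the six-column Linux "ipcs -q" layout shown above.

```shell
#!/bin/sh
# queues_near_limit LIMIT_BYTES [THRESHOLD_PCT]
# Reads `ipcs -q` output on stdin and prints "msqid used-bytes percent" for
# every queue whose used-bytes is at or above THRESHOLD_PCT (default 90) of
# LIMIT_BYTES. Hypothetical helper, not part of DB2 or util-linux.
queues_near_limit() {
    awk -v limit="$1" -v pct="${2:-90}" '
        # Data rows start with a hex key; header and separator lines are skipped.
        $1 ~ /^0x[0-9a-fA-F]+$/ && NF >= 6 {
            used = $5 + 0
            if (used * 100 >= limit * pct)
                printf "%s %d %.1f\n", $2, used, used * 100 / limit
        }'
}
```

    While the load appears stuck, it could be run as: ipcs -q | grep <instance name> | queues_near_limit "$(cat /proc/sys/kernel/msgmnb)". No output means no queue is close to the limit.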

  9. #9
    Join Date
    May 2011
    Posts
    12
    Hi BELLO4K and thanks for helping,

    no other references to that LOADID aside from what was posted earlier.

    my concern about raising msgmnb is that the same load operation used to work in the past with the current value, so i'd rather not resort to raising it for now. that being said, i will rerun my script and monitor it.

    i'll keep you posted!
    thanks again

  10. #10
    Join Date
    May 2011
    Posts
    12
    so i checked ipcs as load is hanging and got this:

    Code:
    ipcs -q | grep db2inst1 | sort -k5n,5n | tac
    0x17ffed9a 111706113  db2inst1   610        8336         2           
    0x00000000 118292501  db2inst1   701        960          20          
    0x00000000 118358039  db2inst1   701        816          17          
    0x00000000 118226963  db2inst1   701        768          16          
    0x5bda20f6 118652959  db2inst1   610        0            0
    it seems well below the 65K threshold..

    this time i skipped the problematic table, but it then hung on the very next table. I checked the db2diag log and essentially the same error message shows up as for the other table.

    still not sure where to go from here...!

  11. #11
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,369
    Does "list utilities show detail" / db2pd -uti show it making any progress at all?

    We need to collect additional diagnostic information during the hang in order to investigate this further. One way to do it is to execute db2fodc -hang documented here: https://www-304.ibm.com/support/docv...id=swg21322385 or at least collect several iterations of call stacks using db2pd -stacks.


    I'd suggest opening a PMR with DB2 Support; they will advise you what information to collect.

  12. #12
    Join Date
    May 2011
    Posts
    12
    the list utilities fails:

    Code:
    $ db2 list utilities show detail
    sh: /bin/db2bpq: No such file or directory
    DB21016E  The Command Line Processor encountered a system error while sending 
    the command to the backend process.
    DB21018E  A system error occurred. The command line processor could not 
    continue processing.
    this is a message i've encountered before when the load hangs. for instance, if i simply try to issue a basic SELECT command, the same thing is returned.

    however the db2pd -uti has shown something very interesting:

    Code:
    $ db2pd -uti
    
    Database Partition 0 -- Active -- Up 0 days 18:06:13 -- Date 05/18/2011 10:11:01
    
    Utilities:
    Address            ID         Type                   State      Invoker    Priority   StartTime           DBName   NumPhases  CurPhase   Description         
    0x0000000200BDEB60 13         LOAD                   0          0          0          Tue May 17 16:08:33 C60      2           2           OFFLINE LOAD DEL AUTOMATIC INDEXING INSERT NON-RECOVERABLE C60HSAP .ditag
    
    Progress:
    Address            ID         PhaseNum   CompletedWork                TotalWork                    StartTime           Description           
    0x0000000200BDF448 13         1          0 bytes                      0 bytes                      Tue May 17 16:08:33 SETUPl                
    0x0000000200BDED60 13         2          1323959 rows                 3691982 rows                 Tue May 17 16:08:34 LOAD
    the number of rows actually DOES progress, but very slowly. 10 minutes after issuing the above command, the row count was only 10000 higher. so it's not hanging per se, it just slows down a lot, and far more than i would expect: loading this table used to take about 15 minutes, and a quick estimate now puts it at about 15 hours..! not to mention the "sh: /bin/db2bpq: No such file or directory" issue mentioned above...
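
    For reference, the remaining time can be estimated from two db2pd -uti samples. The sketch below assumes the load keeps its current rate, which may not hold; load_eta_hours is a hypothetical helper name. With the figures quoted above (1323959 of 3691982 rows, about 10000 rows per 10 minutes) it comes out well above 15 hours for the remaining rows alone.

```shell
#!/bin/sh
# load_eta_hours COMPLETED_ROWS TOTAL_ROWS ROWS_PER_INTERVAL INTERVAL_MINUTES
# Estimate the hours left in the LOAD phase, assuming a constant rate.
# Hypothetical helper; the inputs are read by hand from `db2pd -uti` output.
load_eta_hours() {
    awk -v completed="$1" -v total="$2" -v delta="$3" -v minutes="$4" 'BEGIN {
        if (delta <= 0 || minutes <= 0) {
            print "rate must be positive" > "/dev/stderr"
            exit 1
        }
        rate = delta / minutes                       # rows per minute
        printf "%.1f\n", (total - completed) / rate / 60
    }'
}

# Figures from the db2pd -uti output and the 10-minute follow-up sample.
load_eta_hours 1323959 3691982 10000 10   # prints 39.5
```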

  13. #13
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,369
    From 15min to 15hrs... So, it's not a true hang but a very slow load. Have you made any recent changes? Load script changes? Do you use pipes ( | ) in the script? The reason I asked about pipes is because of the following:

    "Piping the output of a command line processor command into another command line processor command is supported. For example: db2 -x <SQL_statement> | db2 +p -tv. This support is limited only by the pipe buffer size. Pipe buffer sizes are not configurable. If the pipe buffer size is exceeded by the first command, the command line processor might hang or fail when attempting to write the output to the pipe buffer. If the second command is not a command line processor command, for example a UNIX shell command, a hang or failure will not occur due to the pipe buffer size limitation."


    IBM DB2 9.7 for Linux, UNIX and Windows Information Center


    On Linux, the pipe buffer size is 4096 and I don't think it can be changed (?)

    $ ulimit -a | grep pipe
    pipe buffer size (bytes) (-p) 4096

    When this buffer is full, it may hang or return that db2bpq message.

    There are also several possible APARs related to db2bpq.

  14. #14
    Join Date
    May 2011
    Posts
    12
    no unix piping used here...

    i issue commands of the sort:

    Code:
    load from /usr/local/dbs/db2/db2inst1/MyProgram/build/dump/homo_sapiens_core_60_37e/db2/mysql___my_server___3306___homo_sapiens_core_60_37e___ditag___db2.dump of del lobs from /usr/local/dbs/db2/db2inst1/MyProgram/build/dump/homo_sapiens_core_60_37e/db2/ modified by lobsinfile coldel0x09 chardel"" delprioritychar insert into "C60HSAP"."ditag" ("ditag_id", "name", "type", "tag_count", "sequence") NONRECOVERABLE;
    there was a pretty bad crash a few weeks ago, the server had to be hard-rebooted... it could have compromised some essential files, but i would hope db2 is a bit more stable than that...

    maybe a reinstall is necessary?
    Last edited by anthony82; 05-19-11 at 14:19. Reason: forgot [code] tags

  15. #15
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    Quote Originally Posted by anthony82 View Post

    there was a pretty bad crash about a few weeks ago, the server had to be hard-rebooted...
    I would run INSPECT and db2dart to verify the tables and tablespaces that you're loading into.
