Results 1 to 14 of 14
  1. #1
    Join Date
    Jan 2013
    Posts
    5

    Unanswered: Db2 Database Marked BAD

    In a single month, I was forced to face 3 crashes in my productions db. Database is sitting on DB2v9.5.8 32 bit , on a VM. The database has grown recently and when there is a stored procedure is running on a particular table the inconsistency happens.

    What the procedure is doing is the following;

    insert into DISK_MASTER (host_name,DISK)
    select distinct upper(scope.sanitize_hostname(HOST_NAME)),UPPER(ca se when DISPLAYNAME is null then '' else DISPLAYNAME end) from
    DISK_TADDM
    where host_name is not null and TYPE in('IDE disk','SCSI disk') and in_rollup=0
    EXCEPT select upper(host_name),UPPER(disk) from DISK_MASTER;--

    the child table got around 2 crore data.

    Attaching the error screen shots for expert advice. Now the db2diag.log is flooding with the error

    2013-01-13-15.27.13.573000+660 I958809771H539 LEVEL: Error
    PID : 2120 TID : 7484 PROC : db2syscs.exe
    INSTANCE: DB2 NODE : 000 DB : NABDB
    APPHDL : 0-5857 APPID: *LOCAL.DB2.130108061034
    AUTHID : DB2ADMIN
    EDUID : 7484 EDUNAME: db2agent (NABDB) 0
    FUNCTION: DB2 UDB, data protection services, sqlpgWaitForLrecBufferLimit, probe:
    410
    MESSAGE : Ignore rc:
    DATA #1 : Hexdump, 4 bytes
    0x042F5B00 : 5101 0F87 Q...

    I have a backup, but would like to fix this than a restore.. pls advice ur expert opinion.
    Attached Thumbnails Attached Thumbnails 1.png   2.png   3.png   4.png   5.png  


  2. #2
    Join Date
    Mar 2003
    Posts
    280
    You should probably open a PMR, if nothing else just to confirm what the error origins from.

    Meanwhile, some reflections. Running 9.5+ on 32 bit architecture will severely limit the amount of memory db2 can use. It is not recommended for production use. In combination with STMM (if you use that) I would expect all kind of weird behaviour.

    your insert statement can be improved. Except is an elegant operator, but often leads to poor performance. I would aim for NOT EXISTS or MERGE in this case.

    Code:
    insert into DISK_MASTER (host_name,DISK)
    select distinct upper(scope.sanitize_hostname(HOST_NAME)),UPPER(case when DISPLAYNAME is null then '' else DISPLAYNAME end) from
    DISK_TADDM x
    where host_name is not null and TYPE in('IDE disk','SCSI disk') and in_rollup=0
      and not exists (
         select 1 from DISK_MASTER y
         where ...
      )
    
    or 
    
    merge into DISK_MASTER x
    using (
        select ... from DISK_TADDM where ...
    ) y (...)
    on ...
    when not matched then
        insert ...
    Functions such as UPPER in predicates (will be the case if you convert your query), is bad for performance. It will lead to table or index scans. If - for example - DISK_MASTER.host_name is in upper case, add a check constraint such as check ( host_name = upper(host_name) ). In many cases the optimizer will be able to deduce that the function call is redundant and remove it. Still a good idea to remove the function from the predicate in the query though. If there are host_names that are not in UPPER you can add a generated column:

    host_name_upper generated always as ( upper(hostname) ) and index that column

    DB2 will generate as check constraint such as the one above, and in many cases it will be able to deduce that upper(host_name) can be substituted with host_name_upper
    --
    Lennart

  3. #3
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    Quote Originally Posted by thelightening View Post
    the child table got around 2 crore data.
    Which of the two is the child? Crore -- is it the same as неведий, or more like леодр?

    With your DISTINCTs and UPPERs you will be doing table scans all the way, no matter how you write your statement, so you'll need all of the puny 2 GB memory you get on a 32-bit Windows system.

    Find the root cause error in db2diag.log. Dumps in the FODC directory, referred to in the error message, might give you more information.
    ---
    "It does not work" is not a valid problem statement.

  4. #4
    Join Date
    Jan 2013
    Posts
    5
    Thanks for sharing the information, guys !!

    The child table contains thaw huge amount of data. I have tried to verify the fodc dumps. There is a txt file generated which is highlighting towards locks. The txt file name is locklist. So I believe locks on data rows has utilized the memory, which in turn is the reason behind inconsistency.

    But as per Microsoft, if the PAE is enabled, it should use memory more than 4GB right ?
    http://msdn.microsoft.com/en-us/libr...(v=vs.85).aspx (http://msdn.microsoft.com/en-us/libr...=vs.85%29.aspx)


    db2diag.log is still used by db2 to write the same error message I shared. It's still getting flooded. Since log is around 2gb, I am not able to open it with editors. I am looking to fix the database issue first then to work on the SQL. Any suggestion on handing or recovering the Db?
    Last edited by thelightening; 01-13-13 at 18:56.

  5. #5
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,368
    Execute the following:

    - db2diag -A (this command will rename your existing db2diag.log)
    - If you can restart the instance, do db2stop/db2start
    - db2 connect to <db name>
    - Attach new db2diag.log here


    If you're eligible for db2 support, your best option to get this resolved is to open a pmr.

  6. #6
    Join Date
    Jan 2013
    Posts
    5
    Hi Guys !!

    Update for the day: I have copied the db2diag.log from the server to a local hdd, its almost 5 GB now

    In parrellel, I have used intrupted the db2sys from logging the same error message in to the db2diag.log

    -----------------

    2013-01-13-15.27.13.573000+660 I958809771H539 LEVEL: Error
    PID : 2120 TID : 7484 PROC : db2syscs.exe
    INSTANCE: DB2 NODE : 000 DB : NABDB
    APPHDL : 0-5857 APPID: *LOCAL.DB2.130108061034
    AUTHID : DB2ADMIN
    EDUID : 7484 EDUNAME: db2agent (NABDB) 0
    FUNCTION: DB2 UDB, data protection services, sqlpgWaitForLrecBufferLimit, probe:
    410
    MESSAGE : Ignore rc:
    DATA #1 : Hexdump, 4 bytes
    0x042F5B00 : 5101 0F87 Q...
    --------------------

    Now the database recovery mode is initiated and going on..not sure about the end result. Keep my fingers crossed.

    DB2 is also saying its enterprise server edition 9.5.8 not have any memory restriction and Microsoft is saying its 2003 enterprise edition -pae , is not having any memory restriction, but still db2 is able to use only up to 3gb

    PMR is planning to open, raised issues at the managerial level for the same.

  7. #7
    Join Date
    Mar 2003
    Posts
    280
    Quote Originally Posted by thelightening View Post
    [...]
    DB2 is also saying its enterprise server edition 9.5.8 not have any memory restriction and Microsoft is saying its 2003 enterprise edition -pae , is not having any memory restriction, but still db2 is able to use only up to 3gb
    The restriction lies in the memory architecture for 9.5+. Earlier version used processes, 9.5 and above uses threads instead. More or less all memory is adressed via one process, and in a 32 bit instance this means that 4Gb is the maximum amount of memory that can be used. The online documentation for 9.5 is down at the moment, so I can't point you there. Once up you should be able to find info there describing this limitation.
    --
    Lennart

  8. #8
    Join Date
    Jan 2013
    Posts
    15
    You need to open a PMR for this issue.

    A brief look, shows that sqlpgWaitForLrecBufferLimit db2 is waiting to write the log buffer to disk and times out.

    To make it easier to diagnose, I recommend the following:

    1) db2stop force
    2) db2diag -A
    3) db2start
    4) db2 connect to NABDB

    Then take a copy of the db2diag.log before too many other applications try to connect to the database and flood the db2diag.log. You will need to look for the "crash recovery" messages to see the logging start and then identify the first error that occurs as my suspicion is that the message you posted is merely a symptom.


    Cheers,
    Hans

  9. #9
    Join Date
    Jan 2013
    Posts
    5
    Thanks again for all inputs guys !!

    I have recovered the database today morning and took a full offline backup of the same. PMR is going to be raised for understanding the root cause and why the memory is not going beyond 3GB on 32 bit Windows 2003.

    There is no perfect documentation on this limitation, and Lelle12 mentioned, may the documentation is down. But need a perfect pointer on this

    Now the issue is the tables are huge and millions of records are present. I am about to do some tuning apart from the AUTOMATIC key.

    Self tuning memory (SELF_TUNING_MEM) = ON
    Size of database shared memory (4KB) (DATABASE_MEMORY) = AUTOMATIC(399820)
    Database memory threshold (DB_MEM_THRESH) = 10
    Max storage for lock list (4KB) (LOCKLIST) = AUTOMATIC(50432)
    Percent. of lock lists per application (MAXLOCKS) = AUTOMATIC(97)
    Package cache size (4KB) (PCKCACHESZ) = AUTOMATIC(2001)
    Sort heap thres for shared sorts (4KB) (SHEAPTHRES_SHR) = AUTOMATIC(20472)

    Sort list heap (4KB) (SORTHEAP) = AUTOMATIC(4094)
    Database heap (4KB) (DBHEAP) = 1268
    Catalog cache size (4KB) (CATALOGCACHE_SZ) = (MAXAPPLS*5)
    Log buffer size (4KB) (LOGBUFSZ) = 98
    Utilities heap size (4KB) (UTIL_HEAP_SZ) = 15340
    Buffer pool size (pages) (BUFFPAGE) = 250
    SQL statement heap (4KB) (STMTHEAP) = 16384
    Default application heap (4KB) (APPLHEAPSZ) = 4096
    Application Memory Size (4KB) (APPL_MEMORY) = AUTOMATIC(10000)
    Statistics heap size (4KB) (STAT_HEAP_SZ) = 8768

    These parameters are planned to be tuned. Please let me know any experienced suggestion you have based on my current configurations.

  10. #10
    Join Date
    Jan 2013
    Posts
    15
    You are stuck with the 3GB limit for DB2 with /AWE:

    All instance shared memory, database shared memory and agent private memory have to fit into the 2GB user space limit, or the 3GB limit on Advanced Server using the /3GB switch in the boot.ini file. 64GB with AWE support.

    With 32-bit memory structure, no matter how big the physical RAM is, the instance, database configurations are limited by the 4GB addressable space. However, if you have sufficient RAM, you may run multiple instances or databases on the system concurrently; just that each of them has to conform to the above limits. To overcome this limitation, you should consider switching to DB2 64-bit.

  11. #11
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    Quote Originally Posted by UglyBoxer View Post
    if you have sufficient RAM, you may run multiple instances or databases on the system concurrently; just that each of them has to conform to the above limits. To overcome this limitation, you should consider switching to DB2 64-bit.
    I believe that on a 32-bit Windows system all user space programs, including DB2, share the 2 (or 3) GB space, so creating more than one instance will probably make things worse, not better.
    ---
    "It does not work" is not a valid problem statement.

  12. #12
    Join Date
    Jan 2013
    Posts
    15
    Quote Originally Posted by n_i View Post
    I believe that on a 32-bit Windows system all user space programs, including DB2, share the 2 (or 3) GB space, so creating more than one instance will probably make things worse, not better.
    The issue, is not the available memory on the server. However, it is the size of the address space that db2 can use on a 32bit platform.

    Having multiple instances will allow multiple db2sysc.exe to have 3GB of address space, if the server has multiple databases. This will allow better performance by not constaining all the databases to one instance and hence a 3GB limit.

  13. #13
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    OK, I'm sure the author of this topic will find the truth in the Microsoft manuals.
    ---
    "It does not work" is not a valid problem statement.

  14. #14
    Join Date
    Jan 2013
    Posts
    5
    Thanks for all your inputs guys !!

    I am in the process of understanding the product limitations. For sure the SQL's needs to be tuned.

    Will update in final once I am done. Thanks once again.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •