Results 1 to 10 of 10
  1. #1
    Join Date
    Jun 2009
    Posts
    272

    Unanswered: system temp full

    I am in DB2 V9.5 on AIX

    I am seeing this error in db2diag.log

    2010-06-02-12.30.34.742679-240 E2818315A968 LEVEL: Error (OS)
    PID : 622596 TID : 66864 PROC : db2sysc 0
    INSTANCE: rptyesnz NODE : 000 DB : TYESPROD
    APPHDL : 0-40467 APPID: 10.2.200.64.56518.100602163010
    AUTHID : TS109B
    EDUID : 66864 EDUNAME: db2agent (TYESPROD) 0
    FUNCTION: DB2 UDB, oper system services, sqloseekwrite64, probe:40
    MESSAGE : ZRC=0x850F000C=-2062614516=SQLO_DISK "Disk full."
    DIA8312C Disk was full.
    CALLED : OS, -, pwrite
    OSERR : ENOSPC (28) "No space left on device"
    DATA #1 : File handle, PD_TYPE_SQO_FILE_HDL, 8 bytes
    0x070000092DFF5820 : 0000 0134 0000 0208 ...4....
    DATA #2 : unsigned integer, 8 bytes
    4096
    DATA #3 : signed integer, 8 bytes
    15938510848
    DATA #4 : signed integer, 8 bytes
    -1
    DATA #5 : String, 105 bytes
    Search for ossError*Analysis probe point after this log entry for further
    self-diagnosis of this problem.

    2010-06-02-12.30.34.743717-240 I2819284A2722 LEVEL: Error (OS)
    PID : 622596 TID : 66864 PROC : db2sysc 0
    INSTANCE: rptyesnz NODE : 000 DB : TYESPROD
    APPHDL : 0-40467 APPID: 10.2.200.64.56518.100602163010
    AUTHID : TS109B
    EDUID : 66864 EDUNAME: db2agent (TYESPROD) 0
    FUNCTION: DB2 Common, OSSe, ossErrorIOAnalysis, probe:100
    CALLED : OS, -, pwrite
    OSERR : ENOSPC (28) "No space left on device"
    DATA #1 : String, 147 bytes
    A total of 5 analysis will be performed :
    - User info
    - ulimit info
    - Target file info
    - I/O attempt
    - File system

    2010-06-02-12.30.34.744251-240 E2822007A833 LEVEL: Error
    PID : 622596 TID : 66864 PROC : db2sysc 0
    INSTANCE: rptyesnz NODE : 000 DB : TYESPROD
    APPHDL : 0-40467 APPID: 10.2.200.64.56518.100602163010
    AUTHID : TS109B
    EDUID : 66864 EDUNAME: db2agent (TYESPROD) 0
    FUNCTION: DB2 UDB, buffer pool services, sqlbWritePageToDisk, probe:20
    MESSAGE : ADM6017E The table space "TYESSYSTEMTEMP" (ID "1") is full. Detected
    on container "/udb_tyes2_stemp/sms" (ID "0"). The underlying file
    system is full or the maximum allowed space usage for the file system
    has been reached. It is also possible that there are user limits in
    place with respect to maximum file size and these limits have been
    reached.


    "/udb_tyes2_stemp/sms" is the container of the system temp TYESSYSTEMTEMP. I understand that system tempspace got filled up. But is there any way to determine which application handle was responsible for it. How can we determine the temp space used by each application. I am sure the application handle " 0-40467 " which is present in that entry in diag.log was not responsible for filling up the system temp as it was just doing a small insert.

    This is the only one entry which I pasted here. There are several entries during that short period of time with the same errors but with different application handles. So I am trying to determine which one was the culprit.

    Thanks in advance

  2. #2
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,369
    You can check the size of TDA files in /udb_tyes2_stemp/sms and get snapshot for tables. In the snapshot output, look for TEMP (x,y) where x is the tablespace id and y is the table id (y should identify the TDA file). Table schema will give you the application handle. In pre-v9.5 versions, I did fuser on the TDA file to get a list of PIDs using it and then looked for a db2agent.

  3. #3
    Join Date
    Jun 2009
    Posts
    272
    The issue is not happening currently. I only came to know after it has happened. and once the tempspace was filled up, all the applications got rolled back with 968 errors.

    And so I am trying to find out what could have caused it. We collect these information for every 15 minutes

    ts.snapshot.dat0602101445
    dbm.snapshot.dat0602101445
    db.snapshot.dat0602101445
    tbl.snapshot.dat0602101445.Z
    lock.snapshot.dat0602101445.Z
    db2pd.snapshot.dat
    bufferpools.snapshot.dat0602101445
    appl.snapshot.dat0602101445.Z
    db2mtrk.dat0602101445
    db.config.dat060210

    topmemps.snapshot.dat0602101445
    topcpups.snapshot.dat0602101445


    We also collect top 25 table reads and top table writes for every 15 minutes.

    Do you think I can use any of this info to find out the error.


    The only thing which I noticed was from the tablespace snapshot, the total pages for the system temp has started increasing for one hour until it reached 100%. But I can't figure out which is the culprit application.

  4. #4
    Join Date
    May 2003
    Location
    USA
    Posts
    5,737
    Many SQL statements that need to do sorts of large result sets may be creating temporary tables in the temporary tablespace. Reorgs may also need to use temporary tablespaces. You just need to increase the amount of free disk space available for the temporary tablespace.
    M. A. Feldman
    IBM Certified DBA on DB2 for Linux, UNIX, and Windows
    IBM Certified DBA on DB2 for z/OS and OS/390

  5. #5
    Join Date
    Jun 2009
    Posts
    272
    I know that. But this situation never happened in last 10 years. We have 15 GB of space allocated to temp file system and this happened for the first time. Not more than 8 GB will be used at the max. And so I am trying to determine the culprit application based on the snapshots I have.

  6. #6
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,369
    In the table snapshot, do you see any TEMP tables with tablespace id 1?

  7. #7
    Join Date
    Jun 2009
    Posts
    272
    I don't see any temp tablespace ID's in the table snapshot. The only ID's I could see are Reorg Tablespace IDs.


    I am also attaching couple of tablesnapshots which were taken when the tempspace was growing.
    Attached Files Attached Files
    Last edited by blazer789; 06-02-10 at 17:46.

  8. #8
    Join Date
    Jun 2009
    Posts
    272
    Finally I captured the culprit from our top table read outputs which we take every 15 minutes. This temp table was the top

    TABSCHEMA TABNAME ROWS_WRITTEN SNAPSHOT_TIMESTAMP
    ------------------ ----------------------------------- -------------------- --------------------------
    <53853><GPTUX03 > TEMP (00001,00035) 127072968 2010-06-02-12.30.04.766977

  9. #9
    Join Date
    Aug 2008
    Location
    Toronto, Canada
    Posts
    2,369
    and it looks like the only one using this tempspace at the time snapshots were taken:

    Table Schema = <53853><GPTUX03 >
    Table Name = TEMP (00001,00031)
    Table Type = Temporary
    Data Object Pages = 694
    Rows Read = 64153566
    Rows Written = 34681
    Overflows = 0
    Page Reorgs = 0


    Table Schema = <53853><GPTUX03 >
    Table Name = TEMP (00001,00035)
    Table Type = Temporary
    Data Object Pages = 2069463
    Rows Read = 0
    Rows Written = 64153353
    Overflows = 0
    Page Reorgs = 0



    Table Schema = <53853><GPTUX03 >
    Table Name = TEMP (00001,00031)
    Table Type = Temporary
    Data Object Pages = 694
    Rows Read = 94959875
    Rows Written = 34681
    Overflows = 0
    Page Reorgs = 0


    Table Schema = <53853><GPTUX03 >
    Table Name = TEMP (00001,00035)
    Table Type = Temporary
    Data Object Pages = 3063214
    Rows Read = 0
    Rows Written = 94959634
    Overflows = 0
    Page Reorgs = 0

  10. #10
    Join Date
    Jun 2009
    Posts
    272
    You are awesome as you always are !!!!!!

    Thanks Bella !!!!!!

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •