  1. #1
    Join Date
    Mar 2009
    Posts
    22

    Unanswered: shared memory monitoring

    Hi All,

    We are doing performance testing for our application, which was developed on WebLogic 10 and Informix 10.

    Once 400 users were loaded into the application, the shared memory size was around 7 GB (resident + virtual). When we monitor shared memory every hour, it keeps increasing. After a 24-hour run, it reached around 12 GB.

    Our maximum RAM capacity is 15 GB. If this continues, shared memory will reach that maximum.

    My questions are:

    1. Why does it keep increasing?

    2. After the 400 users log out, the occupied memory (15 GB) does not come back down to the 7 GB baseline.

    Is there any parameter to shrink the shared memory size?

    Please advise me on this. It is urgent.

    Thanks in Advance.
    R V

  2. #2
    Join Date
    Jun 2009
    Location
    Lisboa, Portugal
    Posts
    78
    What OS is the Informix server installed on?

  3. #3
    Join Date
    Mar 2009
    Posts
    22
    Hi All,

    The operating system version is HP-UX 11.31.

    Thanks,
    R V

  4. #4
    Join Date
    Jun 2009
    Location
    Lisboa, Portugal
    Posts
    78
    Hi again,

    Sorry to ask again, but I need more information. Could you send me the output of these commands on that server?

    onstat -g seg

    onstat -l

    Thanks.

  5. #5
    Join Date
    Mar 2009
    Posts
    22
    No problem Luis.

    I can provide the details. This is a major problem for us in performance testing.

    onstat -g seg output:

    IBM Informix Dynamic Server Version 10.00.FC8W2 -- On-Line -- Up 11:29:40 -- 7384292 Kbytes

    Segment Summary:
    id       key        addr             size       ovhd   class blkused blkfree
    8585222  1382107137 c000000004200000 6329466880 618304 R*    1545276 4
    1179655  1382107138 c00000017d640000 1024000000 32000  V     239683  10317
    1179656  1382107139 c0000001ba6d0000 3248128    848    M     792     1
    12713993 1382107140 c0000001ba9ec000 204800000  6992   V     13527   36473
    Total:   -          -                7561515008 -      -     1799278 46795

    (* segment locked in memory)
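
    To see at a glance how much of this output is resident versus virtual memory, the rows can be totalled per segment class. The following is a minimal sketch, not an Informix tool: the function names are hypothetical, the sample rows are copied from the output above, and the parsing rules (eight whitespace-separated fields, numeric id first, "*" marking a locked segment) are assumptions based on that layout.

```python
# Sketch: summarize `onstat -g seg` rows by segment class.
# SAMPLE is copied from the output above; the parsing rules are an assumption
# based on that layout, not an official format description.

SAMPLE = """\
id key addr size ovhd class blkused blkfree
8585222 1382107137 c000000004200000 6329466880 618304 R* 1545276 4
1179655 1382107138 c00000017d640000 1024000000 32000 V 239683 10317
1179656 1382107139 c0000001ba6d0000 3248128 848 M 792 1
12713993 1382107140 c0000001ba9ec000 204800000 6992 V 13527 36473
Total: - - 7561515008 - - 1799278 46795
"""


def parse_segments(text):
    """Return one dict per segment row, skipping the header and Total lines."""
    segments = []
    for line in text.splitlines():
        fields = line.split()
        # Segment rows have 8 fields and start with a numeric segment id.
        if len(fields) != 8 or not fields[0].isdigit():
            continue
        segments.append({
            "class": fields[5].rstrip("*"),  # "*" marks a locked segment
            "size": int(fields[3]),
            "blkused": int(fields[6]),
            "blkfree": int(fields[7]),
        })
    return segments


def summarize(segments):
    """Aggregate segment count, bytes, and block usage per class."""
    summary = {}
    for seg in segments:
        entry = summary.setdefault(
            seg["class"], {"count": 0, "size": 0, "blkused": 0, "blkfree": 0}
        )
        entry["count"] += 1
        for key in ("size", "blkused", "blkfree"):
            entry[key] += seg[key]
    return summary


if __name__ == "__main__":
    for cls, entry in summarize(parse_segments(SAMPLE)).items():
        blocks = entry["blkused"] + entry["blkfree"]
        pct_used = 100.0 * entry["blkused"] / blocks
        print(f"{cls}: {entry['count']} segment(s), "
              f"{entry['size'] / 2**30:.2f} GiB, {pct_used:.1f}% of blocks used")
```

    The per-class sizes add up to the size in the Total row, so the same summary can be taken at intervals to see which class is growing.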


    Thanks in Advance,
    R V
    Last edited by rvsenthil; 06-16-09 at 18:56.

  6. #6
    Join Date
    Jun 2009
    Location
    Lisboa, Portugal
    Posts
    78
    Hi R V,

    1179655 1382107138 c00000017d640000 1024000000 32000 V 239683 10317
    1179656 1382107139 c0000001ba6d0000 3248128 848 M 792 1
    12713993 1382107140 c0000001ba9ec000 204800000 6992 V 13527 36473

    As you can see above, your Informix server is using two virtual segments, which is normal because the Informix engine already holds 7 GB.
    The virtual memory size is occupied on the disk volume, as defined in your online.log.

    Why are you so worried about the allocation of shared memory? Are you short of disk space? Has the system had any problem responding at any time?

    I think you have defined the initial virtual memory segment size well, and the allocation for the next segment size, 204800000, is well defined too.

    Did you also notice whether the allocated memory stays the same after all 400 users have logged in?

    Thanks.

  7. #7
    Join Date
    Mar 2009
    Posts
    22
    Thanks a lot for your time Luis.

    What you are saying is right. Currently it occupies 7 GB. After the 400 users log in to the application, it still occupies the same 7 GB.

    Once the 400 users start performing operations in the database through the application, it starts to increase. For example, the size increases by about 200 MB every hour.

    At the end of 24 hours, it has around 7 virtual segments. The online log shows an error like "maximum shared memory reached". At some point, the instance ends up in offline mode.

    What could be the cause of this problem?

    One other thing I noticed: even after the WebLogic server is stopped or shut down, the allocated shared memory does not shrink to the base size. It still occupies 14 GB.

    Is that normal in Informix, or should it return to the base shared memory size?

    Thanks,
    R V

  8. #8
    Join Date
    Jun 2009
    Location
    Lisboa, Portugal
    Posts
    78
    Hi again,

    For the first question, it is not easy to say what causes this without analyzing the online.log and seeing how the configuration is set in onconfig.x, where x is the number of your server; the file is normally in the etc directory. It seems that either your onconfig file has a limit on the maximum number of virtual segments that can be allocated, which causes the server to PANIC, or your OS has a limitation when allocating certain resources, such as disk or the temporary dbspace defined in the onconfig file.

    For the second question, it is normal that the allocated shared memory does not return to the base size. I think that part is only reinitialized when the server goes down, so it is nothing to worry about. But that kind of PANIC situation is not normal.

    If you are free to send those files, I can analyze them for you and give you a more realistic opinion.
    I also need you to post the output of these commands on your OS:

    free -m, to see the free memory of your system, and
    df or diskfree, if your temp space is on the OS, or
    onstat -d, if your temp space is a dbspace, which is most likely

    Thanks.

  9. #9
    Join Date
    Mar 2009
    Posts
    22
    Hi Luis,

    Thanks a lot for the good explanation.

    I couldn't get free -m output. Instead, I have attached swapinfo output.

    Attached files:

    onconfig
    df -k output
    swapinfo
    shared_memory configuration

    Today we are running performance testing on the same box.

    When the 400 users were loaded, the onstat -g seg output was:


    IBM Informix Dynamic Server Version 10.00.FC8W2 -- On-Line -- Up 01:18:06 -- 7184292 Kbytes

    Segment Summary:
    id       key        addr             size       ovhd   class blkused blkfree
    8617990  1382107137 c000000004200000 6329466880 618304 R*    1545276 4
    1212423  1382107138 c00000017d640000 1024000000 32000  V     168320  81680
    1212424  1382107139 c0000001ba6d0000 3248128    848    M     792     1
    Total:   -          -                7356715008 -      -     1714388 81685


    After 3 hours, the onstat -g seg output is:

    IBM Informix Dynamic Server Version 10.00.FC8W2 -- On-Line -- Up 03:38:08 -- 7584292 Kbytes

    Segment Summary:
    id       key        addr             size       ovhd   class blkused blkfree
    8617990  1382107137 c000000004200000 6329466880 618304 R*    1545276 4
    1212423  1382107138 c00000017d640000 1024000000 32000  V     170480  79520
    1212424  1382107139 c0000001ba6d0000 3248128    848    M     792     1
    13008905 1382107140 c0000001c6e7c000 204800000  6992   V     24371   25629
    4915211  1382107141 c0000001d31cc000 204800000  6992   V     7736    42264
    Total:   -          -                7766315008 -      -     1748655 147418
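
    The two snapshots above give enough data for a rough growth estimate. The sketch below takes the "Up" times and Total sizes from those two outputs and the 15 GB RAM figure from the first post. The function names are hypothetical, and treating growth as linear and "15 GB" as decimal gigabytes are both assumptions.

```python
# Sketch: estimate shared-memory growth between two `onstat -g seg` snapshots.
# Uptimes and totals come from the two outputs above; the 15 GB ceiling is the
# RAM size mentioned in the first post. Linear growth is an assumption.

def uptime_seconds(hms):
    """Convert an 'HH:MM:SS' uptime string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def growth_per_hour(first, second):
    """Bytes per hour between two (uptime, total_bytes) snapshots."""
    elapsed = uptime_seconds(second[0]) - uptime_seconds(first[0])
    return (second[1] - first[1]) * 3600 / elapsed


def hours_until(limit_bytes, current_bytes, rate_per_hour):
    """Hours left before current_bytes reaches limit_bytes at a constant rate."""
    return (limit_bytes - current_bytes) / rate_per_hour


FIRST = ("01:18:06", 7356715008)
SECOND = ("03:38:08", 7766315008)
RAM_LIMIT = 15 * 10**9  # "15 GB", read here as decimal gigabytes

if __name__ == "__main__":
    rate = growth_per_hour(FIRST, SECOND)
    print(f"growth: {rate / 10**6:.0f} MB per hour")
    print(f"hours until {RAM_LIMIT} bytes: "
          f"{hours_until(RAM_LIMIT, SECOND[1], rate):.1f}")
```

    Both added segments are 204800000 bytes, so memory grows in steps rather than smoothly, and the hourly rate is only an average over this interval.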

    Once again, thanks for your time.

    Regards,
    R V

  10. #10
    Join Date
    Jun 2009
    Location
    Lisboa, Portugal
    Posts
    78
    Hi R V,

    Now I need some time to analyze this data. Could you also send me the output of "onstat -d" and of "tail -10000 online.log"?


    I can see that you are mirroring in this instance, but you don't specify anything more about it. Is it necessary to mirror this instance with the Informix tools? I ask just out of curiosity, because usually we leave this for the OS to take care of.

    Another point: you are saying that this instance uses 10 CPU VPs for multiprocessing. Is that right?
    Another point: do you have 11 or more instances on this physical server?
    I ask only because, if this is a production server, I think you have many instances. Am I right?
    One other thing I didn't see: I don't think there is an entry for the checkpoint interval. Did I miss it? It should not be more than 300 s; for this instance I think you had better set it to 180 s.

    Please look at these questions and send me the output of "onstat -d" and "tail -10000 online.log".


    Bye.
    LS

  11. #11
    Join Date
    Mar 2009
    Posts
    22
    Hi,

    Sorry for the delayed reply.

    Please see those questions and send me please the output for "onstat -d" and "tail -10000 online.log".

    In mirroring, we are not going to keep mirroring in OS level. not in informix level.
    This server has only two instances. We shut down the other instance while we run performance testing.

    Yes, you are right about the checkpoint interval.

    I have attached the onstat output and the online log.

    Thanks for your time Luis.

  12. #12
    Join Date
    Jun 2009
    Location
    Lisboa, Portugal
    Posts
    78
    Hi again,

    There are some things I don't understand in your log file.

    Can you try to explain?

    1.
    "16:17:06 Warning: Invalid (non-existent/blobspace/disabled) dbspace listed
    in DBSPACETEMP: 'gema_temp_dbs7'"

    The server is asking for a DBSPACETEMP dbspace that does not exist, as you can see in your "onstat -d" file.

    2.
    16:16:54 HP KAIO concurrent requests changed from 1000 to 3000

    How many KAIOs have you reserved for this server from the OS?

    3.
    "16:25:29 DR: Failure recovery error (2)
    16:25:30 DR: Turned off on primary server
    16:25:30 DR: Cannot connect to secondary server
    16:25:45 DR: Primary server connected
    16:25:46 DR: Two servers out of sync.
    HDR secondary server will shut down.
    To re-establish the HDR pair, perform a backup of the primary
    server, and then restore that backup onto the secondary server.
    To convert the HDR pair into two standard servers, restart the
    secondary server using the oninit -S command and
    convert the primary server into a standard server using the
    onmode -d standard command.
    16:25:46 DR: Failure recovery error (2)
    16:25:47 DR: Turned off on primary server
    16:25:47 DR: Cannot connect to secondary server
    16:28:12 Shutdown Mode
    16:28:13 Quiescent Mode
    "

    Do you have HDR replication from this server to another one? Is this the primary server?

    4.
    "14:24:40 DR: Failure recovery error (2)
    14:24:42 DR: Turned off on primary server
    14:24:42 DR: Cannot connect to secondary server
    14:24:53 DR: Primary server connected
    14:24:53 DR: Send error
    14:24:53 ASF Echo-Thread Server: asfcode = -25580: oserr = 32: errstr = : System error occurred in network function.
    System error = 32.
    14:24:53 DR: Failure recovery error (2)
    14:24:55 DR: Turned off on primary server
    14:24:55 DR: Cannot connect to secondary server
    14:25:07 DR: Primary server connected
    14:25:07 DR: Secondary server needs failure recovery
    "

    I can see here that you are constantly having problems with the network or with your HDR replication. Is that right?
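
    One way to check how often these HDR messages recur across the full online.log is to tally them by message text. This is a minimal sketch with a hypothetical function name; the sample is copied from excerpt 4 above, and the "HH:MM:SS DR: text" line pattern is an assumption drawn from those lines.

```python
import re
from collections import Counter

# Sketch: count HDR ("DR:") messages in an online.log excerpt.
# LOG is copied from excerpt 4 above; the "HH:MM:SS DR: text" pattern is an
# assumption drawn from those lines.

LOG = """\
14:24:40 DR: Failure recovery error (2)
14:24:42 DR: Turned off on primary server
14:24:42 DR: Cannot connect to secondary server
14:24:53 DR: Primary server connected
14:24:53 DR: Send error
14:24:53 ASF Echo-Thread Server: asfcode = -25580: oserr = 32: errstr = : System error occurred in network function.
System error = 32.
14:24:53 DR: Failure recovery error (2)
14:24:55 DR: Turned off on primary server
14:24:55 DR: Cannot connect to secondary server
14:25:07 DR: Primary server connected
14:25:07 DR: Secondary server needs failure recovery
"""

DR_LINE = re.compile(r"^\d{2}:\d{2}:\d{2}\s+DR:\s+(.*)$")


def count_dr_messages(text):
    """Return a Counter of DR message texts found in the log text."""
    counts = Counter()
    for line in text.splitlines():
        match = DR_LINE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for message, count in count_dr_messages(LOG).most_common():
        print(f"{count:3d}  {message}")
```

    Run against the whole log instead of this excerpt, the same tally would show whether the failure recovery errors are occasional or continuous.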


    If I were you, for now I would do this:

    1. Find out why it is asking for a DBSPACETEMP dbspace that does not exist.

    2. In the onconfig file, where it says:
    "BUFFERPOOL size=2K,buffers=2500000,lrus=512,lru_min_dirty=1.000000,lru_max_dirty=2.000000"
    increase buffers to 3500000 and set lru_max_dirty=1.500000.

    3. If you can at this time, stop HDR replication. You can see you have failure recovery errors, which can increase your virtual memory, and somehow the server goes down.
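
    Before applying step 2, it may help to work out how much memory the buffer pool holds at each setting. The sketch below is plain arithmetic on the BUFFERPOOL values quoted above; it assumes "size=2K" means 2048-byte pages and ignores any per-buffer overhead, and the function name is hypothetical.

```python
# Sketch: memory needed by the buffer pool before and after the suggested change.
# Assumes "size=2K" means 2048-byte pages and ignores per-buffer overhead.

PAGE_SIZE = 2 * 1024          # bytes per buffer, from size=2K
CURRENT_BUFFERS = 2_500_000   # buffers= value in the current onconfig
PROPOSED_BUFFERS = 3_500_000  # buffers= value suggested in step 2


def pool_bytes(buffers, page_size=PAGE_SIZE):
    """Bytes of page data held by a buffer pool with the given buffer count."""
    return buffers * page_size


if __name__ == "__main__":
    current = pool_bytes(CURRENT_BUFFERS)
    proposed = pool_bytes(PROPOSED_BUFFERS)
    print(f"current:  {current} bytes ({current / 2**30:.2f} GiB)")
    print(f"proposed: {proposed} bytes ({proposed / 2**30:.2f} GiB)")
    print(f"extra:    {proposed - current} bytes")
```

    Under these assumptions the current pool holds 5120000000 bytes, which fits inside the 6329466880-byte resident segment reported by onstat -g seg. The proposed setting would add 2048000000 bytes on a host with 15 GB of RAM.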

    If the problems still remain after these measures, please send me your online.log again, as you did this time.

    OK?

    bye
    Last edited by Luis Santos; 06-20-09 at 20:08.

  13. #13
    Join Date
    Mar 2009
    Posts
    22
    Thanks a lot Luis.

    For the last week, my performance testing server has been down for some other activity. I will answer your questions with full details once it is back up.

    Thanks
    Senthil
