  1. #1
    Join Date
    Jan 2003
    Posts
    19

    Unanswered: ORACLE Memory Problems (Avoiding OOM on overcommit)

    Hi all,

    I have Oracle 10g installed on a Linux box with 512 MB of RAM.
    I'm running tests to stress the memory, but I can't limit the impact of the load, and the machine goes down uncontrollably every time I launch my memory stress script.

    The script launches 100 sessions, each requesting 10000-100000 rows from a join select against big tables.
    I've set sga_max_size=256M and sga_target=256M.
    No matter how I set PGA_AGGREGATE_TARGET, the behaviour is the same. It's currently set to 16KB.
    I've also set the SGA_PRIVATE_LIMIT parameter on the profile that all users belong to.

    The machine uses up all the memory and spends a while swapping processes, but in the end it starts killing processes with the message in the subject line:

    Avoiding OOM on overcommit
    0 Pages of HIGHMEM
    Killing process ....

    The machine dies slowly! Once it has finished, I have to restart it.

    I've read something about the kernel parameters, but I would like to know why processes are running out of memory when my SGA is half of the total memory. I suppose that each process asks for its own PGA space and that this is what fills up all the memory, but the behaviour does not change with the PGA parameters.

    Can anybody help me with this problem?

    Thank you in advance!

  2. #2
    Join Date
    Jan 2004
    Posts
    370
    How much swap space do you have configured?

    Also, 10g will have workarea_size_policy=auto by default.
    Make sure it hasn't been set to manual.
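
    One way to answer the swap question on Linux is to read the SwapTotal and SwapFree lines of /proc/meminfo. The following is only a sketch: swap_mb is a made-up helper name, and it assumes the usual "Name: value kB" layout of that file.

```shell
#!/bin/sh
# Sketch: report configured and free swap in MB from /proc/meminfo-style
# text on stdin. swap_mb is a hypothetical helper name, not a system tool.
swap_mb() {
    awk '
        /^SwapTotal:/ { total_kb = $2 }   # value is reported in kB
        /^SwapFree:/  { free_kb  = $2 }
        END { printf "swap_total_mb=%d swap_free_mb=%d\n", total_kb / 1024, free_kb / 1024 }
    '
}

# Usage on the database host:
#   swap_mb < /proc/meminfo
```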


  3. #3
    Join Date
    Jan 2003
    Posts
    19
    The machine was already configured, but it has 512 MB of swap. I want to resolve the problem under these conditions by changing Oracle's memory parameters, but it's really difficult.
    I will try changing that parameter.

    Thanks a lot!

  4. #4
    Join Date
    Jan 2003
    Posts
    19
    Hi again,

    The workarea_size_policy is set to auto.
    I've changed the pga_aggregate to the rest of the available memory (250MB) left by the SGA, which is set to 256MB.
    Now PMON dies and the database goes down with the following error:
    ##################
    Errors in file /data/oracle/rdbms/log/test_pmon_1778.trc:
    ORA-00470: LGWR process terminated with error
    Thu Dec 8 09:59:07 2005
    PMON: terminating instance due to error 470
    Termination issued to instance processes. Waiting for the processes to exit
    Instance terminated by PMON, pid = 1778
    #################

    In the file test_pmon_1778.trc there's nothing interesting, only a memory dump, but I don't understand it...

    Any ideas?

    Thanks a lot!

  5. #5
    Join Date
    Jan 2004
    Posts
    370
    Quote Originally Posted by mmolowny
    The machine was already configured, but it has 512 MB of swap.

    You should have at least 1GB of swap space configured for 512MB of RAM (i.e. double the RAM, at LEAST).


  6. #6
    Join Date
    Jan 2003
    Posts
    19
    Yes, I know that, but in any case, Oracle should control its processes so that they don't run out of memory and bring the machine down.
    I don't understand...
    Each process asks for its PGA space. I don't know whether I can reach a solution through the kernel parameters, but it should be possible through the Oracle memory settings. This isn't normal, is it?

    thanks

  7. #7
    Join Date
    Jan 2004
    Posts
    370
    Quote Originally Posted by mmolowny
    I've changed the pga_aggregate to the rest of the available memory (250MB) left by the SGA, which is set to 256MB.
    Just to make sure I understand.

    Are you saying you have set pga_aggregate_target to 250MB?
    And you have set sga_max_size to 256MB?



  8. #8
    Join Date
    Jan 2003
    Posts
    19
    Yes that's it.

    pga_aggregate_target=250M
    sga_max_size=256M

    But I'm also testing without automatic PGA management and it's the same.
    The problem is that Oracle allocates memory for processes that shouldn't get any because there is no memory left. Even if I use the hidden parameter _pga_max_size (not supported!), nothing changes... the machine dies with the OOM killer killing processes. I know I can configure the kernel so that it doesn't kill processes, but the problem is that Oracle should not do this kind of thing.

    Thanks!

  9. #9
    Join Date
    Jan 2004
    Posts
    370
    Your system is obviously running out of memory but are you saying that Oracle is taking more memory than you have told it to take? The PGA memory is growing unbounded? If so, how are you measuring the amount of PGA memory Oracle is allocating?

    How do you know if Oracle has broken the pga_aggregate_target (or even reached it) before the memory is exhausted? Oracle will continue to allocate memory up to the limits you have specified, even if there is not enough memory on the system.

    What is the output of vmstat:

    1. When Oracle isn't running.
    2. When Oracle has been started (but before you have any sessions running)
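
    On the Oracle side, the PGA question can be checked against the V$PGASTAT view, which reports the configured target next to what has actually been allocated. Keep in mind that pga_aggregate_target is a target rather than a hard cap, so "over allocation count" is the row to watch. This is only a sketch: pga_stats_sql is a made-up helper name, and the usage line assumes sqlplus is on the PATH and that OS authentication as SYSDBA works on this host.

```shell
#!/bin/sh
# Sketch: print a SQL*Plus script that compares the PGA actually allocated
# with the configured target. The statistic names are rows of V$PGASTAT;
# pga_stats_sql is a hypothetical helper name.
pga_stats_sql() {
    # Quoted heredoc delimiter so the shell does not expand v$pgastat.
    cat <<'EOF'
SET PAGESIZE 100 LINESIZE 120
SELECT name, value, unit
  FROM v$pgastat
 WHERE name IN ('aggregate PGA target parameter',
                'total PGA allocated',
                'maximum PGA allocated',
                'over allocation count');
EXIT
EOF
}

# Usage on the database host (assumes sqlplus and OS authentication):
#   pga_stats_sql | sqlplus -s "/ as sysdba"
```

    Running the query before, during and after the stress script shows whether "total PGA allocated" ever climbs past the target.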


  10. #10
    Join Date
    Jan 2003
    Posts
    19
    I don't know whether it has reached the PGA limit or not, but the system is dedicated to Oracle, so I've set the Oracle memory parameters to occupy all the system memory. The problem is that when I launch this script, the system goes down with the OOM killer killing everything. There must be a memory problem. I'm only running my Oracle instance, the listener, and my script, plus top and vmstat to see when I have to push the reset button.
    Currently I'm testing with sga_max_size=400M, sga_target=400M and pga_aggregate_target=100M and it's the same thing.

    /var/log/messages when it crashes:

    Dec 9 15:33:14 pcitdb21 kernel: Mem-info:
    Dec 9 15:33:15 pcitdb21 kernel: Zone:DMA freepages: 2895 min: 0 low: 0 high: 0
    Dec 9 15:33:15 pcitdb21 kernel: Zone:Normal freepages: 509 min: 510 low: 2237 high: 3228
    Dec 9 15:33:15 pcitdb21 kernel: Zone:HighMem freepages: 0 min: 0 low: 0 high: 0
    Dec 9 15:33:15 pcitdb21 kernel: Free pages: 3404 ( 0 HighMem)
    Dec 9 15:33:15 pcitdb21 kernel: ( Active: 100403/1499, inactive_laundry: 0, inactive_clean: 0, free: 3404 )
    Dec 9 15:33:15 pcitdb21 kernel: aa:0 ac:0 id:0 il:0 ic:0 fr:2895
    Dec 9 15:33:15 pcitdb21 kernel: aa:99447 ac:972 id:1484 il:0 ic:0 fr:509
    Dec 9 15:33:15 pcitdb21 kernel: aa:0 ac:0 id:0 il:0 ic:0 fr:0
    Dec 9 15:33:15 pcitdb21 kernel: 3*4kB 4*8kB 3*16kB 3*32kB 2*64kB 2*128kB 1*256kB 1*512kB 0*1024kB 1*2048kB 2*4096kB = 11580kB)
    Dec 9 15:33:15 pcitdb21 kernel: 3*4kB 7*8kB 1*16kB 9*32kB 2*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 2036kB)
    Dec 9 15:33:15 pcitdb21 kernel: Swap cache: add 214485, delete 211083, find 15073/42229, race 0+1
    Dec 9 15:33:15 pcitdb21 kernel: 4190 pages of slabcache
    Dec 9 15:33:16 pcitdb21 kernel: 662 pages of kernel stacks
    Dec 9 15:33:16 pcitdb21 kernel: 15066 lowmem pagetables, 0 highmem pagetables
    Dec 9 15:33:16 pcitdb21 kernel: 0 bounce buffer pages, 0 are on the emergency list
    Dec 9 15:33:16 pcitdb21 kernel: Free swap: 0kB
    Dec 9 15:33:17 pcitdb21 kernel: 131069 pages of RAM
    Dec 9 15:33:17 pcitdb21 kernel: 0 pages of HIGHMEM
    Dec 9 15:33:17 pcitdb21 kernel: 3502 reserved pages
    Dec 9 15:33:18 pcitdb21 kernel: 17484 pages shared
    Dec 9 15:33:18 pcitdb21 kernel: 3493 pages swap cached
    Dec 9 15:33:18 pcitdb21 kernel: Out of Memory: Killed process 5166 (java).
    Dec 9 15:35:38 pcitdb21 kernel: Mem-info:
    Dec 9 15:35:38 pcitdb21 kernel: Zone:DMA freepages: 2895 min: 0 low: 0 high: 0
    Dec 9 15:35:38 pcitdb21 kernel: Zone:Normal freepages: 509 min: 510 low: 2237 high: 3228
    Dec 9 15:35:38 pcitdb21 kernel: Zone:HighMem freepages: 0 min: 0 low: 0 high: 0
    Dec 9 15:35:38 pcitdb21 kernel: Free pages: 3404 ( 0 HighMem)
    Dec 9 15:35:38 pcitdb21 kernel: ( Active: 101835/220, inactive_laundry: 0, inactive_clean: 0, free: 3404 )
    Dec 9 15:35:38 pcitdb21 kernel: aa:0 ac:0 id:0 il:0 ic:0 fr:2895
    Dec 9 15:35:38 pcitdb21 kernel: aa:100786 ac:1008 id:261 il:0 ic:0 fr:509
    Dec 9 15:35:38 pcitdb21 kernel: aa:0 ac:0 id:0 il:0 ic:0 fr:0
    Dec 9 15:35:38 pcitdb21 kernel: 3*4kB 4*8kB 3*16kB 3*32kB 2*64kB 2*128kB 1*256kB 1*512kB 0*1024kB 1*2048kB 2*4096kB = 11580kB)
    Dec 9 15:35:38 pcitdb21 kernel: 3*4kB 3*8kB 3*16kB 9*32kB 2*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 2036kB)
    Dec 9 15:35:38 pcitdb21 kernel: Swap cache: add 216738, delete 213346, find 15408/43183, race 0+1
    Dec 9 15:35:39 pcitdb21 kernel: 4025 pages of slabcache
    Dec 9 15:35:43 pcitdb21 kernel: 638 pages of kernel stacks
    Dec 9 15:35:45 pcitdb21 kernel: 15035 lowmem pagetables, 0 highmem pagetables
    Dec 9 15:35:45 pcitdb21 kernel: 0 bounce buffer pages, 0 are on the emergency list
    Dec 9 15:35:46 pcitdb21 kernel: Free swap: 0kB


    vmstat when ORACLE instance is not running:
    procs           memory            swap       io       system         cpu
     r  b  swpd  free  inact  active  si  so   bi   bo   in   cs  us  sy   id  wa
     0  0     0   307     69      79   0   0  240   38   74   68   3   2   87   9
     0  0     0   307     69      79   0   0    0    0  104   47   0   0  100   0
     0  0     0   307     69      79   0   0    0    0  104   50   0   0  100   0
     0  0     0   307     69      79   0   0    0    0  104   51   0   0  100   0
     0  0     0   307     69      79   0   0    0    0  104   55   0   0  100   0
     0  0     0   307     69      79   0   0    0   40  105   60   0   0  100   0
     0  0     0   307     69      79   0   0    0    0  104   48   0   0  100   0
     0  0     0   307     69      79   0   0    0    0  103   45   0   0  100   0
     0  0     0   307     69      79   0   0    0    0  104   49   0   0  100   0
     0  0     0   307     69      79   0   0    0    0  103   50   0   0  100   0
     0  0     0   307     69      79   0   0    0   16  107   60   0   0  100   0
     0  0     0   307     69      79   0   0    0    0  103   51   0   0  100   0

    vmstat with oracle instance running:

     r  b  swpd  free  inact  active  si  so   bi   bo   in   cs  us  sy   id  wa
     0  0     0    16    108     307   0   0  139   17   72   73   2   1   91   6
     0  0     0    16    108     307   0   0    0   80  107  128   0   0   99   1
     0  0     0    16    108     307   0   0    0   48  109  115   0   0  100   0
     0  0     0    16    108     307   0   0   32    0  107  118   0   0   99   0
     0  0     0    15    109     308   0   0  432    0  159  152   3   0   90   6
     0  0     0    15    109     308   0   0    0   48  108  117   0   0  100   0
     0  0     0    15    109     308   0   0    0  164  139  119   0   0   99   1
     0  0     0    15    109     308   0   0    0    0  103  106   0   0  100   0
     0  0     0    15    109     308   0   0    0   48  109  118   0   0  100   0


    Thanks a lot!
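
    The lines that matter in a Mem-info dump like the one above can be pulled out with a small filter. This is a rough sketch: oom_summary is a made-up helper name, and the 4 kB page size is an assumption (it fits 131069 pages on a 512 MB machine).

```shell
#!/bin/sh
# Sketch: summarise a kernel Mem-info dump read from stdin.
# oom_summary is a hypothetical helper name; page_kb=4 assumes 4 kB pages.
oom_summary() {
    awk -v page_kb=4 '
        # "... kernel: 131069 pages of RAM" -> the count is the 4th field from the end
        /pages of RAM/   { printf "ram_mb=%d\n", $(NF-3) * page_kb / 1024 }
        # "... kernel: Free swap: 0kB" -> numeric prefix of the last field
        /Free swap:/     { printf "free_swap_kb=%d\n", $NF + 0 }
        # "... Killed process 5166 (java)." -> the PID is the next-to-last field
        /Killed process/ { printf "killed_pid=%d\n", $(NF-1) }
    '
}

# Usage on the database host:
#   oom_summary < /var/log/messages
```

    A free_swap_kb=0 line just before each killed_pid line indicates that RAM and swap were both exhausted when the kernel started killing processes.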

  11. #11
    Join Date
    Aug 2003
    Location
    Where the Surf Meets the Turf @Del Mar, CA
    Posts
    7,776
    Provided Answers: 1
    >Dec 9 15:33:16 pcitdb21 kernel: Free swap: 0kB
    >Dec 9 15:35:46 pcitdb21 kernel: Free swap: 0kB
    The OS has run out of swap space & that fact is NOT Oracle's fault, but the Unix SA's fault (PEBKAC).
    You can lead some folks to knowledge, but you can not make them think.
    The average person thinks he's above average!
    For most folks, they don't know, what they don't know.
    Good judgement comes from experience. Experience comes from bad judgement.

  12. #12
    Join Date
    Jan 2003
    Posts
    19
    PEBKAC has been clear from the beginning; it's always there when I'm around :-)
    But I still don't understand... If I've set the memory limits for Oracle (now I have 256M for the SGA and 100M for PGA_AGGREGATE), why should it run out of memory? What are these limits for?
    And why should it be an OS problem?
    I've checked out the kernel parameters and they seem to be ok...

  13. #13
    Join Date
    Jan 2004
    Posts
    370
    It looks like you only have about 16MB to work with after Oracle starts. You only have 512MB of RAM but you are allowing the SGA up to 400MB and up to a further 100MB of PGA. There is nothing left for the O/S. You should either add more memory or run with smaller SGA and PGA settings. You should also certainly add more swap space (which should help avoid the system killing off processes). The easiest way is just to add a swap file (you may know this but I've included it below anyway).

    If you have 1GB free disk space on a partition somewhere you can just add a swap file.

    cd to the partition and then run the following:

    dd if=/dev/zero of=swapfile bs=1024 count=1048576

    chmod 600 swapfile

    mkswap swapfile

    swapon swapfile

    You can add an entry to the /etc/fstab file so that it picks it up at boot time.

    <Full path to swapfile>/swapfile swap swap defaults 0 0
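
    The arithmetic behind "there is nothing left for the O/S" can be sketched as follows. headroom_mb is a made-up helper name, and treating pga_aggregate_target as fully used is a simplifying assumption. The kernel, the listener and the base footprint of every server process all have to fit into whatever remains.

```shell
#!/bin/sh
# Sketch: RAM left over once the SGA and the PGA target are subtracted.
# All values are in MB. headroom_mb is a hypothetical helper name.
headroom_mb() {
    ram_mb=$1
    sga_mb=$2
    pga_target_mb=$3
    echo $(( ram_mb - sga_mb - pga_target_mb ))
}

headroom_mb 512 400 100   # settings from post #10, prints 12
headroom_mb 512 256 100   # settings from post #12, prints 156
```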

  14. #14
    Join Date
    Jan 2003
    Posts
    19
    OK people!
    This is another error I got just now... the machine is up, but the instance is dead:

    The parameters are the same:
    SGA=256M
    PGA_AGGREGATE_TARGET=100M

    alert.log:

    Fri Dec 9 16:38:34 2005
    Errors in file /data/oracle/rdbms/log/_pmon_1919.trc:
    ORA-00470: LGWR process terminated with error
    Fri Dec 9 16:38:49 2005
    PMON: terminating instance due to error 470
    Termination issued to instance processes. Waiting for the processes to exit
    Fri Dec 9 16:39:03 2005
    Instance terminated by PMON, pid = 1919

    pmon trace file:
    .....
    error 470 detected in background process
    ORA-00470: LGWR process terminated with error

    All processes give the following error line:
    Can not execute SQL STATEMENT ORA-03113 end-of-file on communication channel
    ...if the instance is down, it's difficult to communicate, isn't it?

    Any ideas?
    Thanks!

  15. #15
    Join Date
    Jan 2003
    Posts
    19
    Thanks sywriter, but the machine is full and I have to fix these problems under these conditions because I don't have time. Even with the swap as it is, Oracle should not occupy all the memory... or should it?

    Thanks!
