Re: postgres invoked oom-killer

Silvio Brandani wrote:
Lacey Powers wrote:
Silvio Brandani wrote:
We have a PostgreSQL 8.3.8 server on Linux.

We get the following messages in /var/log/messages:

May  6 22:31:01 pgblade02 kernel: postgres invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0
May  6 22:31:01 pgblade02 kernel:
May  6 22:31:01 pgblade02 kernel: Call Trace:
May  6 22:31:19 pgblade02 kernel: [<ffffffff800bed05>] out_of_memory+0x8e/0x2f5
May  6 22:31:19 pgblade02 kernel: [<ffffffff8000f071>] __alloc_pages+0x22b/0x2b4
May  6 22:31:19 pgblade02 kernel: [<ffffffff80012720>] __do_page_cache_readahead+0x95/0x1d9
May  6 22:31:19 pgblade02 kernel: [<ffffffff800618e1>] __wait_on_bit_lock+0x5b/0x66
May  6 22:31:19 pgblade02 kernel: [<ffffffff881fdc61>] :dm_mod:dm_any_congested+0x38/0x3f
May  6 22:31:19 pgblade02 kernel: [<ffffffff800130ab>] filemap_nopage+0x148/0x322
May  6 22:31:19 pgblade02 kernel: [<ffffffff800087ed>] __handle_mm_fault+0x1f8/0xdf4
May  6 22:31:19 pgblade02 kernel: [<ffffffff80064a6a>] do_page_fault+0x4b8/0x81d
May  6 22:31:19 pgblade02 kernel: [<ffffffff80060f29>] thread_return+0x0/0xeb
May  6 22:31:19 pgblade02 kernel: [<ffffffff8005bde9>] error_exit+0x0/0x84
May  6 22:31:27 pgblade02 kernel:
May  6 22:31:28 pgblade02 kernel: Mem-info:
May  6 22:31:28 pgblade02 kernel: Node 0 DMA per-cpu:
May  6 22:31:28 pgblade02 kernel: cpu 0 hot: high 0, batch 1 used:0
May  6 22:31:28 pgblade02 kernel: cpu 0 cold: high 0, batch 1 used:0
May  6 22:31:28 pgblade02 kernel: cpu 1 hot: high 0, batch 1 used:0
May  6 22:31:28 pgblade02 kernel: cpu 1 cold: high 0, batch 1 used:0
May  6 22:31:28 pgblade02 kernel: cpu 2 hot: high 0, batch 1 used:0
May  6 22:31:28 pgblade02 kernel: cpu 2 cold: high 0, batch 1 used:0
May  6 22:31:28 pgblade02 kernel: cpu 3 hot: high 0, batch 1 used:0
May  6 22:31:28 pgblade02 kernel: cpu 3 cold: high 0, batch 1 used:0
May  6 22:31:28 pgblade02 kernel: Node 0 DMA32 per-cpu:
May  6 22:31:28 pgblade02 kernel: cpu 0 hot: high 186, batch 31 used:27
May  6 22:31:29 pgblade02 kernel: cpu 0 cold: high 62, batch 15 used:54
May  6 22:31:29 pgblade02 kernel: cpu 1 hot: high 186, batch 31 used:23
May  6 22:31:29 pgblade02 kernel: cpu 1 cold: high 62, batch 15 used:49
May  6 22:31:29 pgblade02 kernel: cpu 2 hot: high 186, batch 31 used:12
May  6 22:31:29 pgblade02 kernel: cpu 2 cold: high 62, batch 15 used:14
May  6 22:31:29 pgblade02 kernel: cpu 3 hot: high 186, batch 31 used:50
May  6 22:31:29 pgblade02 kernel: cpu 3 cold: high 62, batch 15 used:60
May  6 22:31:29 pgblade02 kernel: Node 0 Normal per-cpu:
May  6 22:31:29 pgblade02 kernel: cpu 0 hot: high 186, batch 31 used:5
May  6 22:31:29 pgblade02 kernel: cpu 0 cold: high 62, batch 15 used:48
May  6 22:31:29 pgblade02 kernel: cpu 1 hot: high 186, batch 31 used:11
May  6 22:31:29 pgblade02 kernel: cpu 1 cold: high 62, batch 15 used:39
May  6 22:31:29 pgblade02 kernel: cpu 2 hot: high 186, batch 31 used:14
May  6 22:31:29 pgblade02 kernel: cpu 2 cold: high 62, batch 15 used:57
May  6 22:31:29 pgblade02 kernel: cpu 3 hot: high 186, batch 31 used:94
May  6 22:31:29 pgblade02 kernel: cpu 3 cold: high 62, batch 15 used:36
May  6 22:31:29 pgblade02 kernel: Node 0 HighMem per-cpu: empty
May  6 22:31:29 pgblade02 kernel: Free pages: 41788kB (0kB HighMem)
May  6 22:31:29 pgblade02 kernel: Active:974250 inactive:920579 dirty:0 writeback:0 unstable:0 free:10447 slab:11470 mapped-file:985 mapped-anon:1848625 pagetables:111027
May  6 22:31:29 pgblade02 kernel: Node 0 DMA free:11172kB min:12kB low:12kB high:16kB active:0kB inactive:0kB present:10816kB pages_scanned:0 all_unreclaimable? yes
May  6 22:31:29 pgblade02 kernel: lowmem_reserve[]: 0 3254 8052 8052
May 6 22:31:29 pgblade02 kernel: Node 0 DMA32 free:23804kB min:4636kB low:5792kB high:6952kB active:1555260kB inactive:1566144kB present:3332668kB pages_scanned:35703257 all_unreclaimable? yes
May  6 22:31:29 pgblade02 kernel: lowmem_reserve[]: 0 0 4797 4797
May 6 22:31:29 pgblade02 kernel: Node 0 Normal free:6812kB min:6836kB low:8544kB high:10252kB active:2342332kB inactive:2115836kB present:4912640kB pages_scanned:10165709 all_unreclaimable? yes
May  6 22:31:29 pgblade02 kernel: lowmem_reserve[]: 0 0 0 0
May 6 22:31:29 pgblade02 kernel: Node 0 HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
May  6 22:31:29 pgblade02 kernel: lowmem_reserve[]: 0 0 0 0
May  6 22:31:29 pgblade02 kernel: Node 0 DMA: 3*4kB 5*8kB 3*16kB 6*32kB 4*64kB 3*128kB 0*256kB 0*512kB 2*1024kB 0*2048kB 2*4096kB = 11172kB
May  6 22:31:29 pgblade02 kernel: Node 0 DMA32: 27*4kB 0*8kB 1*16kB 0*32kB 2*64kB 4*128kB 0*256kB 1*512kB 0*1024kB 1*2048kB 5*4096kB = 23804kB
May  6 22:31:29 pgblade02 kernel: Node 0 Normal: 21*4kB 9*8kB 26*16kB 3*32kB 6*64kB 5*128kB 0*256kB 0*512kB 1*1024kB 0*2048kB 1*4096kB = 6812kB
May  6 22:31:29 pgblade02 kernel: Node 0 HighMem: empty
May 6 22:31:29 pgblade02 kernel: Swap cache: add 71286821, delete 71287152, find 207780333/216904318, race 1387+10506
May  6 22:31:29 pgblade02 kernel: Free swap  = 0kB
May  6 22:31:30 pgblade02 kernel: Total swap = 8388600kB
May  6 22:31:30 pgblade02 kernel: Free swap:            0kB
May  6 22:31:30 pgblade02 kernel: 2293759 pages of RAM
May  6 22:31:30 pgblade02 kernel: 249523 reserved pages
May  6 22:31:30 pgblade02 kernel: 56111 pages shared
May  6 22:31:30 pgblade02 kernel: 260 pages swap cached
May 6 22:31:30 pgblade02 kernel: Out of memory: Killed process 29076 (postgres).


We get the following errors in the postgres log:

A couple of times:
2010-05-06 22:26:28 CEST [23001]: [2-1] WARNING: worker took too long to start; cancelled
Then:
2010-05-06 22:31:21 CEST [29059]: [27-1] LOG: system logger process (PID 29076) was terminated by signal 9: Killed
Finally:
2010-05-06 22:50:20 CEST [29059]: [28-1] LOG: background writer process (PID 22999) was terminated by signal 9: Killed
2010-05-06 22:50:20 CEST [29059]: [29-1] LOG: terminating any other active server processes

Any help highly appreciated,

---


Hello Silvio,

Is this machine dedicated to PostgreSQL?

If so, I'd recommend adding these two parameters to your sysctl.conf

vm.overcommit_memory = 2
vm.overcommit_ratio = 0

so that memory overcommit is disabled and the OOM killer is effectively turned off.

PostgreSQL should degrade gracefully when a malloc() fails because it asked for more memory than is actually available.
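
For reference, this is roughly how the change could be applied on a typical Linux box (a minimal sketch; the exact file path and whether you need root/sudo depend on your distribution):

# Append the overcommit settings to /etc/sysctl.conf (as root).
cat >> /etc/sysctl.conf <<'EOF'
# Disable memory overcommit so allocations fail instead of the OOM killer firing.
vm.overcommit_memory = 2
vm.overcommit_ratio = 0
EOF

# Load the new settings without rebooting.
sysctl -p

# Verify the values the kernel is actually using.
sysctl vm.overcommit_memory vm.overcommit_ratio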

Hope that helps. =)

Regards,
Lacey



Thanks a lot,
yes the server is dedicated to PostgreSQL.

Could the fact that the system went out of memory be a bug in PostgreSQL? What could be the cause of it?

Regards,
Silvio


Hello Silvio,

This isn't a bug in PostgreSQL.

The OOM killer is an OS-level mechanism, designed to free up memory by terminating a low-priority process.

So, something filled up the available memory on your server, and the OOM killer then decided to ungracefully terminate PostgreSQL. =(

It's equivalent to sending a kill -9 <pid> to PostgreSQL, which is not a good thing to do, ever.

If you have sar (or other system resource logging) running, pairing that data with your other logs might give you an idea of what caused the out-of-memory condition, if you're interested.
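
For example, something along these lines could show memory and swap pressure around the time of the kill (a sketch; the sa file name and the time window are assumptions based on the May 6 timestamps above):

# Memory utilization from the sysstat archive for the 6th, around the incident.
sar -r -f /var/log/sa/sa06 -s 22:00:00 -e 22:40:00

# Swapping activity over the same window.
sar -W -f /var/log/sa/sa06 -s 22:00:00 -e 22:40:00

# Cross-check with the kernel's own OOM report.
grep -i oom-killer /var/log/messages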

But, since this is a dedicated machine, if you just add the parameters to your sysctl.conf, this shouldn't happen again. =)
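
If you want to confirm the effect once the settings are in place, the kernel's commit accounting is visible in /proc/meminfo (illustrative only):

# With vm.overcommit_memory = 2, CommitLimit = swap + overcommit_ratio% of RAM,
# and new allocations fail once Committed_AS would exceed that limit.
grep -E 'CommitLimit|Committed_AS' /proc/meminfo

With overcommit_ratio = 0 and the ~8 GB of swap shown in your kernel log, that cap would sit at roughly the swap size; some admins raise the ratio instead so more of the physical RAM stays usable.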

Hope that helps. =)

Regards,

Lacey

--
Lacey Powers

The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564 ext 104
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


--
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
