Thanks, Vladimir and Emanuel.

> How many connections (active and max) do you have?
max_connections = 2000
Active: 1000

> Are you sure that nothing else could eat memory (e.g. some poorly-written cronjob?)
There are 2 or 3 light cron jobs, but the free command shows free memory... why? :s

> How much VIRT memory does postmaster have before the crash?
8518m

> Here are some suggestions:
> Check memory limits in /etc/security/limits.conf and in `ulimit -a`.
The config file is the default.

[postgres@SERVER ~]$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 268287
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 268287
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

> tail /var/log/audit/audit.log (it might be something interesting there especially if you are running selinux)
SELinux is disabled.

> Try to upgrade to 8.1.17 (it should be safe and fast operation).
I will update.

> Check your kernel's shm settings.
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
These are the defaults; I think they are already too big.

> Check fsm settings.
max_fsm_pages = 180000
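
To put the kernel shm values above in the same units as the PostgreSQL settings, here is a minimal sketch of the arithmetic. It assumes a 4096-byte page (kernel.shmall is counted in pages; `getconf PAGE_SIZE` shows the real value) and uses the shared_buffers value from the postgresql.conf quoted further down; the constant and function names are made up for illustration.

```python
# Values quoted in this thread.
SHMMAX_BYTES = 68719476736          # kernel.shmmax, in bytes
SHMALL_PAGES = 4294967296           # kernel.shmall, in pages
SHARED_BUFFERS_BLOCKS = 1048576     # shared_buffers, 8 KB blocks
VIRT_MB = 8518                      # VIRT of postmaster before the crash

# Assumptions, not taken from the thread.
PAGE_SIZE = 4096                    # bytes per page, typical on x86_64
BLOCK_SIZE = 8 * 1024               # bytes per PostgreSQL buffer

GIB = 1024 ** 3
MIB = 1024 ** 2


def to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / GIB


shmmax_gib = to_gib(SHMMAX_BYTES)
shmall_gib = to_gib(SHMALL_PAGES * PAGE_SIZE)
shared_buffers_gib = to_gib(SHARED_BUFFERS_BLOCKS * BLOCK_SIZE)

# How much of VIRT is not explained by the buffer pool alone.
virt_over_buffers_mb = VIRT_MB - (SHARED_BUFFERS_BLOCKS * BLOCK_SIZE) // MIB

if __name__ == "__main__":
    print(f"shmmax         = {shmmax_gib:.0f} GiB")
    print(f"shmall         = {shmall_gib:.0f} GiB")
    print(f"shared_buffers = {shared_buffers_gib:.0f} GiB")
    print(f"VIRT beyond shared_buffers = {virt_over_buffers_mb} MB")
```

Under these assumptions the kernel limits are far above the 8 GiB buffer pool, so they are unlikely to be what the allocation is running into, and the 8518m VIRT figure is mostly the shared memory segment itself.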
regards...
Date: Sat, 1 Aug 2009 12:20:27 +0400
Subject: Re: out of memory
From: vladimir@xxxxxxxxxxxxxx
To: fabrixio1@xxxxxxxxxxx
CC: pgsql-admin@xxxxxxxxxxxxxx; alvherre@xxxxxxxxxxxxxxxxx

On Sat, Aug 1, 2009 at 1:20 AM, Fabricio <fabrixio1@xxxxxxxxxxx> wrote:
Hi
Does anyone know why this is happening?
I changed the OS to 64 bits and now the oom-killer no longer kicks in, but Postgres is still showing out of memory.
Linux SERVER 2.6.18-92.el5 #1 SMP Tue Apr 29 13:16:15 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 5.2 (Tikanga)
PostgreSQL 8.1.15
32GB RAM
My postgresql.conf:

# - Memory -

shared_buffers = 1048576                # min 16 or max_connections*2, 8KB each
temp_buffers = 1024                     # min 100, 8KB each
max_prepared_transactions = 20          # can be 0 or more
# note: increasing max_prepared_transactions costs ~600 bytes of shared memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
work_mem = 1024                         # min 64, size in KB
maintenance_work_mem = 65536            # min 1024, size in KB
max_stack_depth = 2048                  # min 100, size in KB
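
As a rough sanity check on the settings above, the sketch below converts them to MB and estimates a ceiling for the configured connection count. It is only an approximation: the 2000 and 1000 connection figures come from the reply at the top of this thread, work_mem applies per sort or hash operation (a single query can use several), temp_buffers are allocated per session only as needed, and per-backend overhead outside these settings is ignored. The names are illustrative.

```python
KB = 1024
MB = 1024 * KB
BLOCK = 8 * KB  # PostgreSQL buffer size, per the "8KB each" comments

# Settings from the postgresql.conf above, converted to bytes.
settings_bytes = {
    "shared_buffers": 1048576 * BLOCK,
    "temp_buffers": 1024 * BLOCK,
    "work_mem": 1024 * KB,
    "maintenance_work_mem": 65536 * KB,
    "max_stack_depth": 2048 * KB,
}


def in_mb(n_bytes):
    """Convert a byte count to whole megabytes."""
    return n_bytes // MB


def rough_ceiling_mb(connections):
    """Shared buffers plus one work_mem and a full set of temp buffers
    per connection. A simplification, not a precise model."""
    per_backend = settings_bytes["work_mem"] + settings_bytes["temp_buffers"]
    return in_mb(settings_bytes["shared_buffers"] + connections * per_backend)


if __name__ == "__main__":
    for name, value in settings_bytes.items():
        print(f"{name:22s} {in_mb(value):6d} MB")
    print("ceiling at 2000 connections:", rough_ceiling_mb(2000), "MB")
    print("ceiling at 1000 connections:", rough_ceiling_mb(1000), "MB")
```

Even the 2000-connection estimate stays below the 32 GB of RAM on this machine, which fits the observation that `free` still reports plenty of memory when the errors appear.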

kernel messages:
Jul 31 11:50:08 SERVER kernel: postmaster[7686]: segfault at 00007fff3feb1bb0 rip 00002b2f7e17e1a8 rsp 00007fff3feb1b90 error 6
Jul 31 15:41:55 SERVER kernel: postmaster[4737]: segfault at 00007fff3feb1bb0 rip 00002b2f7e1851a8 rsp 00007fff3feb1b90 error 6

PostgreSQL log:
<2009-07-31 15:41:55 MDT 7253 > LOG: could not fork new process for connection: Cannot allocate memory
<2009-07-31 15:41:55 MDT 7253 > LOG: could not fork new process for connection: Cannot allocate memory
<2009-07-31 15:41:55 MDT 10.27.41.74(2606) aforeglobal sysaforeglobal 7423 startup> FATAL: out of memory
<2009-07-31 15:41:55 MDT 7253 > LOG: could not fork new process for connection: Cannot allocate memory
<2009-07-31 15:41:55 MDT 7253 > LOG: could not fork new process for connection: Cannot allocate memory
<2009-07-31 15:41:55 MDT 10.27.36.219(3859) db user 7424 startup> FATAL: out of memory
<2009-07-31 15:41:55 MDT 10.20.216.82(1966) db user 7431 startup> FATAL: out of memory
TopMemoryContext: 164432 total in 6 blocks; 5368 free (1 chunks); 159064 used
MdSmgr: 0 total in 0 blocks; 0 free (0 chunks); 0 used
LockTable (locallock hash): 8192 total in 1 blocks; 3744 free (0 chunks); 4448 used
Timezones: 52560 total in 2 blocks; 3744 free (0 chunks); 48816 used
ErrorContext: 8192 total in 1 blocks; 8160 free (4 chunks); 32 used
<2009-07-31 15:41:55 MDT 10.33.128.38(4458) db user 7434 startup> FATAL: out of memory
<2009-07-31 15:41:55 MDT 10.33.128.38(4458) db user 7434 startup> DETAIL: Failed on request of size 24000.
TopMemoryContext: 164432 total in 6 blocks; 5368 free (1 chunks); 159064 used
MdSmgr: 0 total in 0 blocks; 0 free (0 chunks); 0 used
LockTable (locallock hash): 8192 total in 1 blocks; 3744 free (0 chunks); 4448 used
Timezones: 52560 total in 2 blocks; 3744 free (0 chunks); 48816 used
ErrorContext: 8192 total in 1 blocks; 8160 free (4 chunks); 32 used
<2009-07-31 15:41:55 MDT 10.33.128.38(4459) db user 7435 startup> FATAL: out of memory
<2009-07-31 15:41:55 MDT 10.33.128.38(4459) db user 7435 startup> DETAIL: Failed on request of size 24000.
TopMemoryContext: 164432 total in 6 blocks; 5368 free (1 chunks); 159064 used
MdSmgr: 0 total in 0 blocks; 0 free (0 chunks); 0 used
LockTable (locallock hash): 8192 total in 1 blocks; 3744 free (0 chunks); 4448 used
Timezones: 52560 total in 2 blocks; 3744 free (0 chunks); 48816 used
ErrorContext: 8192 total in 1 blocks; 8160 free (4 chunks); 32 used
<2009-07-31 15:41:55 MDT 10.33.128.38(4460) db user 7436 startup> FATAL: out of memory
<2009-07-31 15:41:55 MDT 10.33.128.38(4460) db user 7436 startup> DETAIL: Failed on request of size 24000.
<2009-07-31 15:41:55 MDT 10.33.128.38(4461) db user 7438 startup> FATAL: out of memory
<2009-07-31 15:41:55 MDT 10.33.128.38(4462) db user 7439 startup> FATAL: out of memory
<2009-07-31 15:41:55 MDT 10.33.128.38(4463) db user 7440 startup> FATAL: out of memory
<2009-07-31 15:41:55 MDT 10.20.219.194(3594) db user 7433 startup> FATAL: out of memory
<2009-07-31 15:41:55 MDT 10.33.128.38(4464) db user 7441 startup> FATAL: out of memory
<2009-07-31 15:41:55 MDT 10.33.128.38(4465) db user 7442 startup> FATAL: out of memory
<2009-07-31 15:41:55 MDT 10.33.128.31(1263) db user 7447 startup> FATAL: out of memory
<2009-07-31 15:41:55 MDT 10.44.5.43(3498) db user 7450 startup> FATAL: out of memory
<2009-07-31 15:41:55 MDT 10.27.36.219(3860) db user 7448 startup> FATAL: out of memory
<2009-07-31 15:41:55 MDT 10.33.128.10(35976) db user 7460 idle> LOG: unexpected EOF on client connection
<2009-07-31 15:41:55 MDT 7253 > LOG: server process (PID 4737) was terminated by signal 11
<2009-07-31 15:41:55 MDT 7253 > LOG: terminating any other active server processes
<2009-07-31 15:41:55 MDT 10.33.128.10(35976) db user 7460 idle> WARNING: terminating connection because of crash of another server process

Before crash:
date & free -m
Fri Jul 31 15:40:01 MDT 2009
 15:40:01 up 4:48, 3 users, load average: 2.64, 3.04, 3.58
             total       used       free     shared    buffers     cached
Mem:         32187      22292       9895          0        164      19824
-/+ buffers/cache:       2303      29884
Swap:         1983          0       1983

After crash:
date & free -m
Fri Jul 31 15:45:01 MDT 2009
 15:45:01 up 4:53, 3 users, load average: 4.45, 3.99, 3.80
             total       used       free     shared    buffers     cached
Mem:         32187      14726      17460          0        165      13850
-/+ buffers/cache:        710      31477
Swap:         1983          0       1983

Hm, it looks weird. BTW, I assume that 24000 in "DETAIL: Failed on request of size 24000" is a size in bytes, but maybe it's a size in kilobytes? o_O
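
For the two `free -m` snapshots above, the "-/+ buffers/cache" line is what shows how much memory processes actually hold, since the kernel can reclaim buffers and page cache on demand. A small sketch of that arithmetic, using only the numbers posted (the function names are illustrative):

```python
def app_used_mb(used, buffers, cached):
    """Memory held by processes: 'used' minus reclaimable buffers and cache."""
    return used - buffers - cached


def available_mb(free, buffers, cached):
    """Memory that could be handed to processes: free plus reclaimable."""
    return free + buffers + cached


# Values in MB, copied from the free -m output in this thread.
snapshots = {
    "before crash": {"used": 22292, "free": 9895, "buffers": 164, "cached": 19824},
    "after crash": {"used": 14726, "free": 17460, "buffers": 165, "cached": 13850},
}

if __name__ == "__main__":
    for label, s in snapshots.items():
        held = app_used_mb(s["used"], s["buffers"], s["cached"])
        avail = available_mb(s["free"], s["buffers"], s["cached"])
        print(f"{label}: {held} MB held by processes, {avail} MB available")
```

The results agree with free's own "-/+ buffers/cache" line to within a couple of MB of rounding: roughly 2.3 GB held by processes before the crash and about 29 GB available. So the machine as a whole does not look short of RAM at that moment, which is consistent with checking limits and kernel settings rather than total memory.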

How many connections (active and max) do you have?
Are you sure that nothing else could eat memory (e.g. some poorly-written cronjob?)
How much VIRT memory does postmaster have before the crash?

Here are some suggestions:
Check memory limits in /etc/security/limits.conf and in `ulimit -a`.
tail /var/log/audit/audit.log (it might be something interesting there especially if you are running selinux)
Try to upgrade to 8.1.17 (it should be safe and fast operation).
Check your kernel's shm settings.
Check fsm settings.
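
The first suggestion (memory limits) can also be checked from a script with Python's standard `resource` module. This is a minimal sketch, not part of the original advice; it reports the limits of the process that runs it, so it would need to run as the postgres user in the same environment the postmaster is started from. Sizes come back in bytes, whereas `ulimit -a` prints kbytes.

```python
import resource

# Limits most relevant to "Cannot allocate memory" on fork,
# labelled with the matching ulimit flag.
LIMITS = {
    "virtual memory (-v)": resource.RLIMIT_AS,
    "data seg size (-d)": resource.RLIMIT_DATA,
    "stack size (-s)": resource.RLIMIT_STACK,
    "max user processes (-u)": resource.RLIMIT_NPROC,
    "open files (-n)": resource.RLIMIT_NOFILE,
}


def fmt(value):
    """Render a limit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return {label: (soft, hard)} for the limits listed above."""
    return {name: resource.getrlimit(res) for name, res in LIMITS.items()}


if __name__ == "__main__":
    for name, (soft, hard) in current_limits().items():
        print(f"{name:26s} soft={fmt(soft):>12s} hard={fmt(hard):>12s}")
```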
HTH
-- Vladimir Rusinov http://greenmice.info/