Thank you guys! Found this in the logs (db-slave1 is my replica that uses streaming replication):

Aug 18 15:49:38 db-slave1 kernel: [25094456.525703] postgres invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
Aug 18 15:49:38 db-slave1 kernel: [25094456.525708] postgres cpuset=/ mems_allowed=0
Aug 18 15:49:38 db-slave1 kernel: [25094456.525712] Pid: 26418, comm: postgres Not tainted 3.2.0-40-virtual #64-Ubuntu
Aug 18 15:49:48 db-slave1 kernel: [25094456.621937] Out of memory: Kill process 2414 (postgres) score 417 or sacrifice child
Aug 18 15:49:48 db-slave1 kernel: [25094456.621949] Killed process 2414 (postgres) total-vm:13057464kB, anon-rss:28560kB, file-rss:12773376kB
Aug 19 03:18:00 db-slave1 kernel: [25135758.540539] postgres invoked oom-killer: gfp_mask=0x280da, order=0, oom_adj=0, oom_score_adj=0
Aug 19 03:18:01 db-slave1 kernel: [25135758.540544] postgres cpuset=/ mems_allowed=0
Aug 19 03:18:07 db-slave1 kernel: [25135758.540548] Pid: 23994, comm: postgres Not tainted 3.2.0-40-virtual #64-Ubuntu
Aug 19 03:18:07 db-slave1 kernel: [25135758.626405] Out of memory: Kill process 28354 (postgres) score 348 or sacrifice child
Aug 19 03:18:07 db-slave1 kernel: [25135758.626418] Killed process 28354 (postgres) total-vm:13021248kB, anon-rss:8704kB, file-rss:10686512kB
Aug 19 03:18:07 db-slave1 kernel: [25135763.068736] postgres invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
Aug 19 03:18:07 db-slave1 kernel: [25135763.068740] postgres cpuset=/ mems_allowed=0
Aug 19 03:18:07 db-slave1 kernel: [25135763.068743] Pid: 6780, comm: postgres Not tainted 3.2.0-40-virtual #64-Ubuntu
Aug 19 03:18:07 db-slave1 kernel: [25135763.150285] Out of memory: Kill process 20322 (postgres) score 301 or sacrifice child
Aug 19 03:18:07 db-slave1 kernel: [25135763.150297] Killed process 20322 (postgres) total-vm:13058892kB, anon-rss:47172kB, file-rss:9186604kB

So I will do as Stéphane advised and make shared_buffers 6GB.
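As a quick sanity check on that diagnosis, here is a small sketch (plain Python, with the memory figures copied from the "Killed process" lines above) showing that file-backed RSS, which includes the 12GB shared_buffers segment each backend maps, dominates the totals, while private (anon) memory is tiny:

```python
import re

# The "Killed process" lines from the syslog excerpt above.
log = """
Killed process 2414 (postgres) total-vm:13057464kB, anon-rss:28560kB, file-rss:12773376kB
Killed process 28354 (postgres) total-vm:13021248kB, anon-rss:8704kB, file-rss:10686512kB
Killed process 20322 (postgres) total-vm:13058892kB, anon-rss:47172kB, file-rss:9186604kB
"""

pattern = r"total-vm:(\d+)kB, anon-rss:(\d+)kB, file-rss:(\d+)kB"
for total_vm, anon_rss, file_rss in re.findall(pattern, log):
    # anon-rss (process-private memory) is a few dozen MB at most, while
    # file-rss -- which counts the mapped shared_buffers segment -- is what
    # makes each backend look huge to the OOM killer.
    print(f"anon: {int(anon_rss)/1024**2:.2f} GB, file: {int(file_rss)/1024**2:.2f} GB")
```

So the OOM killer's victims are not actually hoarding private memory; they mostly just have the oversized shared segment mapped in.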
Do you know if I need to do anything else, such as increasing the shared memory (SHMMAX/SHMALL) parameters? Right now I have:

shmmax = 13223870464
shmall = 4194304

Thanks,
Dzmitry

On 8/19/13 1:05 PM, "Albe Laurenz" <laurenz.albe@xxxxxxxxxx> wrote:
>>> Dzmitry wrote:
>>>> On 8/19/13 11:36 AM, "Stéphane Schildknecht"
>>>> <stephane.schildknecht@xxxxxxxxxxxxx> wrote:
>>>>> On 19/08/2013 10:07, Dzmitry wrote:
>>>>>> I have a postgres server running on Ubuntu 12 with 8 Intel Xeon CPUs
>>>>>> and 29 GB of RAM, with the following settings:
>>>>>> max_connections = 550
>>>>>> shared_buffers = 12GB
>>>>>> temp_buffers = 8MB
>>>>>> max_prepared_transactions = 0
>>>>>> work_mem = 50MB
>>>>>> maintenance_work_mem = 1GB
>>>>>> fsync = on
>>>>>> wal_buffers = 16MB
>>>>>> commit_delay = 50
>>>>>> commit_siblings = 7
>>>>>> checkpoint_segments = 32
>>>>>> checkpoint_completion_target = 0.9
>>>>>> effective_cache_size = 22GB
>>>>>> autovacuum = on
>>>>>> autovacuum_vacuum_threshold = 1800
>>>>>> autovacuum_analyze_threshold = 900
>>>>>>
>>>>>> I am doing a lot of writes to the DB in 40 different threads, so every
>>>>>> thread checks whether a record exists; if not, it inserts the record,
>>>>>> and if it does, it updates the record.
>>>>>> During these updates my disk IO is almost always at 100%, and sometimes
>>>>>> it crashes my DB with the following message:
>>>>>>
>>>>>> 2013-08-19 03:18:00 UTC LOG: checkpointer process (PID 28354) was
>>>>>> terminated by signal 9: Killed
>>> [...]
>>>>>>
>>>>>> My DB size is not very big: 169GB.
>>>>>>
>>>>>> Does anyone know how I can get rid of the DB crashes?
>>>
>>>>> The fact that the checkpointer was killed with signal 9 makes me think
>>>>> the OOM killer has detected you were out of memory.
>>>>>
>>>>> Could that be the case?
>>>>>
>>>>> 12GB of shared_buffers on a 29GB box is too high. You should try to
>>>>> lower that value to 6GB, for instance.
>>>>> 550*50MB, that is 27GB of RAM that PostgreSQL could try to address.
>>>>>
>>>>> I can imagine your system is swapping a lot, and you exhaust swap
>>>>> space before the crash.
>>>
>>>> I don't think that's the case. I am using New Relic for monitoring my DB
>>>> servers (I have one master and 2 slaves; all use the same configuration).
>>>> Memory does not go above 12.5GB, so I have a good reserve, and I also
>>>> don't see any swapping there :(
>>>
>>> You can check by examining /var/log/messages to see if the OOM
>>> killer is at fault, which is highly likely.
>>>
>>> The OOM killer uses heuristics, so it occasionally does the wrong thing.
>>>
>>> The documentation is helpful:
>>> http://www.postgresql.org/docs/9.2/static/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT
>
>> Do you mean the postgres log file (configured in postgresql.conf)?
>>
>> log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
>> log_min_messages = warning
>>
>> Or /var/log/messages? I ask because I don't have that file :(
>
> I meant the latter.
> /var/log/messages is just where syslog output is directed on some
> Linux distributions. I don't know Ubuntu, so sorry if I got
> it wrong. Maybe it is /var/log/syslog on Ubuntu.
> In case of doubt, check your syslog configuration.
>
> Yours,
> Laurenz Albe

--
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
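Regarding the SHMMAX/SHMALL question: a rough back-of-the-envelope check (a sketch only, assuming the usual 4 kB page size and a guessed ~10% overhead on top of shared_buffers; the exact segment size depends on the build and other settings) suggests the current kernel values already accommodate a 6GB shared_buffers:

```python
PAGE_SIZE = 4096                  # bytes; verify with `getconf PAGE_SIZE`

shmmax = 13223870464              # current kernel.shmmax, in bytes (~12.3 GB)
shmall = 4194304                  # current kernel.shmall, in pages (~16 GB)

shared_buffers = 6 * 1024**3      # the proposed 6GB setting, in bytes

# PostgreSQL (pre-9.3) allocates one large System V shared memory segment:
# roughly shared_buffers plus some overhead for other shared structures.
# The 10% headroom here is an assumption, not an exact figure.
segment_estimate = int(shared_buffers * 1.1)

print("SHMMAX ok:", shmmax >= segment_estimate)                # SHMMAX ok: True
print("SHMALL ok:", shmall * PAGE_SIZE >= segment_estimate)    # SHMALL ok: True
```

If both checks hold, no sysctl change should be needed for the smaller setting; SHMMAX only becomes a problem again if shared_buffers grows back past roughly 12 GB.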