Re: postgres crashes on insert in 40 different threads

Stéphane Schildknecht <stephane.schildknecht@xxxxxxxxxxxxx> · Mon, 19 Aug 2013 10:36:39 +0200

On 19/08/2013 10:07, Dzmitry wrote:
Hey folks,
I have a PostgreSQL server running on Ubuntu 12 (Intel Xeon, 8 CPUs, 29 GB RAM)
with the following settings:
max_connections = 550
shared_buffers = 12GB
temp_buffers = 8MB
max_prepared_transactions = 0
work_mem = 50MB
maintenance_work_mem = 1GB
fsync = on
wal_buffers = 16MB
commit_delay = 50
commit_siblings = 7
checkpoint_segments = 32
checkpoint_completion_target = 0.9
effective_cache_size = 22GB
autovacuum = on
autovacuum_vacuum_threshold = 1800
autovacuum_analyze_threshold = 900

I am doing a lot of writes to the DB in 40 different threads: every thread
checks whether a record exists; if not, it inserts the record, and if so, it
updates the record. During these updates, my disk I/O is almost always at 100%,
and sometimes it crashes my DB with the following message:

2013-08-19 03:18:00 UTC LOG:  checkpointer process (PID 28354) was terminated 
by signal 9: Killed
2013-08-19 03:18:00 UTC LOG:  terminating any other active server processes
2013-08-19 03:18:00 UTC WARNING:  terminating connection because of crash of 
another server process
2013-08-19 03:18:00 UTC DETAIL:  The postmaster has commanded this server 
process to roll back the current transaction and exit, because another server 
process exited abnormally and possibly corrupted shared memory.
2013-08-19 03:18:00 UTC HINT:  In a moment you should be able to reconnect to 
the database and repeat your command.
2013-08-19 03:18:00 UTC WARNING:  terminating connection because of crash of 
another server process
2013-08-19 03:18:00 UTC DETAIL:  The postmaster has commanded this server 
process to roll back the current transaction and exit, because another server 
process exited abnormally and possibly corrupted shared memory.
2013-08-19 03:18:00 UTC HINT:  In a moment you should be able to reconnect to 
the database and repeat your command.
2013-08-19 03:18:00 UTC WARNING:  terminating connection because of crash of 
another server process
2013-08-19 03:18:00 UTC DETAIL:  The postmaster has commanded this server 
process to roll back the current transaction and exit, because another server 
process exited abnormally and possibly corrupted shared memory.

My DB size is not very big – 169GB.

Does anyone know how I can get rid of the DB crash?

Thanks,
  Dzmitry

The fact that the checkpointer was killed with signal 9 makes me think the OOM
killer detected that you were out of memory.

Could that be the case?
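
One way to check is to look through the kernel log for OOM killer entries. A
minimal sketch in Python, assuming you feed it the kernel log text (for
instance the output of dmesg); the function name and the matched phrases are
my own assumptions about typical kernel wording, not an exhaustive list:

```python
import re

# Phrases the Linux kernel typically logs when the OOM killer acts.
# Assumption: the exact wording varies between kernel versions.
OOM_PATTERN = re.compile(r"out of memory|oom-killer|killed process", re.IGNORECASE)


def find_oom_lines(log_text):
    """Return the lines of log_text that look like OOM killer activity."""
    return [line for line in log_text.splitlines() if OOM_PATTERN.search(line)]
```

If this returns lines that mention a postgres PID around 03:18 UTC, the OOM
killer is the likely cause.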

12GB of shared_buffers on a 29GB box is too high. You should try lowering that
value to 6GB, for instance.
With max_connections = 550 and work_mem = 50MB, that is 550 * 50MB, about 27GB
of RAM that PostgreSQL could try to address.
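
To make the arithmetic explicit, here is a rough worst-case estimate using the
values from your configuration. It is only a sketch: it assumes every
connection uses exactly one work_mem allocation at the same time, whereas a
single query can use work_mem several times (once per sort or hash), and most
connections will use far less.

```python
# Worst-case memory estimate from the posted settings (values from the post).
MB_PER_GB = 1024

max_connections = 550
work_mem_mb = 50
shared_buffers_gb = 12
total_ram_gb = 29

# Assumption: one work_mem allocation per connection, all at once.
work_mem_total_gb = max_connections * work_mem_mb / MB_PER_GB
worst_case_gb = shared_buffers_gb + work_mem_total_gb

print(f"work_mem across all connections: {work_mem_total_gb:.1f} GB")
print(f"plus shared_buffers: {worst_case_gb:.1f} GB of {total_ram_gb} GB RAM")
```

Even before counting the OS and filesystem cache, that total is well above the
physical RAM of the machine.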

I can imagine your system is swapping a lot and exhausts its swap space before
the crash.

Regards,

--
Stéphane Schildknecht
Loxodata - Conseil, expertise et formations
