Re: postgres crashes on insert in 40 different threads

Dzmitry <dzmitry.nikitsin@xxxxxxxxx> · Mon, 19 Aug 2013 11:45:08 +0300

I don't think that's the case. I am using New Relic to monitor my DB
servers (I have one master and 2 slaves, all using the same configuration).
Memory usage does not go above 12.5GB, so I have a good reserve, and I don't
see any swapping there :(

Thanks,
  Dzmitry

On 8/19/13 11:36 AM, "Stéphane Schildknecht"
<stephane.schildknecht@xxxxxxxxxxxxx> wrote:

>On 19/08/2013 10:07, Dzmitry wrote:
>> Hey folks,
>> I have a Postgres server running on Ubuntu 12 (Intel Xeon, 8 CPUs, 29 GB RAM)
>> with the following settings:
>> max_connections = 550
>> shared_buffers = 12GB
>> temp_buffers = 8MB
>> max_prepared_transactions = 0
>> work_mem = 50MB
>> maintenance_work_mem = 1GB
>> fsync = on
>> wal_buffers = 16MB
>> commit_delay = 50
>> commit_siblings = 7
>> checkpoint_segments = 32
>> checkpoint_completion_target = 0.9
>> effective_cache_size = 22GB
>> autovacuum = on
>> autovacuum_vacuum_threshold = 1800
>> autovacuum_analyze_threshold = 900
>>
>> I am doing a lot of writes to the DB in 40 different threads. Every thread
>> checks whether a record exists: if not => insert the record, if it exists =>
>> update the record. During these updates my disk IO is almost always at 100%,
>> and sometimes it crashes my DB with the following message:
>>
>> 2013-08-19 03:18:00 UTC LOG:  checkpointer process (PID 28354) was terminated by signal 9: Killed
>> 2013-08-19 03:18:00 UTC LOG:  terminating any other active server processes
>> 2013-08-19 03:18:00 UTC WARNING:  terminating connection because of crash of another server process
>> 2013-08-19 03:18:00 UTC DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
>> 2013-08-19 03:18:00 UTC HINT:  In a moment you should be able to reconnect to the database and repeat your command.
>> 2013-08-19 03:18:00 UTC WARNING:  terminating connection because of crash of another server process
>> 2013-08-19 03:18:00 UTC DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
>> 2013-08-19 03:18:00 UTC HINT:  In a moment you should be able to reconnect to the database and repeat your command.
>> 2013-08-19 03:18:00 UTC WARNING:  terminating connection because of crash of another server process
>> 2013-08-19 03:18:00 UTC DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
>>
>> My DB size is not very big: 169GB.
>>
>> Does anyone know how I can get rid of the DB crash?
>>
>>
>> Thanks,
>>   Dzmitry
>>
>
>The fact that the checkpointer was killed with signal 9 makes me think the
>OOM killer detected that you were out of memory.
>
>Could that be the case?
>
>12GB of shared_buffers on a 29GB box is too high. You should try lowering
>that value to 6GB, for instance.
>550*50MB (max_connections * work_mem) is 27GB of RAM that PostgreSQL could
>try to address.
>
>I can imagine your system is swapping a lot and exhausting swap space
>before the crash.
>
>Regards,
>
>-- 
>Stéphane Schildknecht
>Loxodata - Conseil, expertise et formations
>
>
>
>-- 
>Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
>To make changes to your subscription:
>http://www.postgresql.org/mailpref/pgsql-admin
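
The worst-case arithmetic in the quoted reply (550*50MB) can be checked with
a small sketch. The settings and RAM size are the ones from the original
post; the function name worst_case_mb and the one-work_mem-per-connection
model are assumptions made for illustration. The model is a simplification,
since a single query can use several work_mem allocations (one per sort or
hash operation), so the real ceiling may be higher.

```python
# Values taken from the settings quoted in this thread. Units are MB.
GB = 1024  # MB per GB

ram_mb = 29 * GB             # physical RAM on the box
shared_buffers_mb = 12 * GB  # shared_buffers = 12GB
max_connections = 550        # max_connections = 550
work_mem_mb = 50             # work_mem = 50MB


def worst_case_mb(connections, work_mem, shared_buffers):
    """Rough upper bound, assuming one work_mem allocation per connection.

    Hypothetical helper: it ignores maintenance_work_mem, temp_buffers and
    per-backend overhead, and a query may use work_mem more than once.
    """
    return connections * work_mem + shared_buffers


work_mem_total_mb = max_connections * work_mem_mb
total_mb = worst_case_mb(max_connections, work_mem_mb, shared_buffers_mb)

print(f"work_mem across all connections: {work_mem_total_mb / GB:.1f} GB")
print(f"plus shared_buffers:             {total_mb / GB:.1f} GB")
print(f"physical RAM:                    {ram_mb / GB:.1f} GB")
print("over budget" if total_mb > ram_mb else "within budget")
```

With these numbers the work_mem term alone is about 26.9 GB (the 27GB
mentioned above), and adding shared_buffers gives roughly 38.9 GB against
29 GB of RAM. This is only a ceiling on what could be requested, not a
measurement of what the server actually uses.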
