Hey folks,

I have a Postgres server running on Ubuntu 12 (Intel Xeon, 8 CPUs, 29 GB RAM) with the following settings:

max_connections = 550
shared_buffers = 12GB
temp_buffers = 8MB
max_prepared_transactions = 0
work_mem = 50MB
maintenance_work_mem = 1GB
fsync = on
wal_buffers = 16MB
commit_delay = 50
commit_siblings = 7
checkpoint_segments = 32
checkpoint_completion_target = 0.9
effective_cache_size = 22GB
autovacuum = on
autovacuum_vacuum_threshold = 1800
autovacuum_analyze_threshold = 900

I am doing a lot of writes to the DB from 40 different threads. Each thread checks whether a record exists: if it does not, the thread inserts the record; if it does, the thread updates it. During these updates my disk I/O is almost always at 100%, and sometimes the DB crashes with the following messages:

2013-08-19 03:18:00 UTC LOG: checkpointer process (PID 28354) was terminated by signal 9: Killed
2013-08-19 03:18:00 UTC LOG: terminating any other active server processes
2013-08-19 03:18:00 UTC WARNING: terminating connection because of crash of another server process
2013-08-19 03:18:00 UTC DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2013-08-19 03:18:00 UTC HINT: In a moment you should be able to reconnect to the database and repeat your command.
2013-08-19 03:18:00 UTC WARNING: terminating connection because of crash of another server process
2013-08-19 03:18:00 UTC DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2013-08-19 03:18:00 UTC HINT: In a moment you should be able to reconnect to the database and repeat your command.
2013-08-19 03:18:00 UTC WARNING: terminating connection because of crash of another server process
2013-08-19 03:18:00 UTC DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

My DB is not very big: 169GB. Does anyone know how I can get rid of these DB crashes?

Thanks,
Dzmitry
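
For reference, the per-thread write logic described above (check whether the record exists, then insert or update) is roughly the pattern sketched below. This is only an illustration, not the actual application code: it uses Python's built-in sqlite3 module as a stand-in for Postgres so that it runs standalone, and the table `records`, its columns, and the functions `make_db` and `upsert_record` are hypothetical names.

```python
import sqlite3


def make_db():
    """Create an in-memory stand-in database with a hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (key TEXT PRIMARY KEY, value TEXT)")
    return conn


def upsert_record(conn, key, value):
    """Check whether a row exists; insert it if not, update it otherwise."""
    cur = conn.cursor()
    cur.execute("SELECT 1 FROM records WHERE key = ?", (key,))
    if cur.fetchone() is None:
        # Record does not exist yet => insert it.
        cur.execute(
            "INSERT INTO records (key, value) VALUES (?, ?)", (key, value)
        )
        action = "insert"
    else:
        # Record already exists => update it.
        cur.execute(
            "UPDATE records SET value = ? WHERE key = ?", (value, key)
        )
        action = "update"
    conn.commit()  # one commit per record
    return action


if __name__ == "__main__":
    conn = make_db()
    print(upsert_record(conn, "a", "1"))  # prints "insert"
    print(upsert_record(conn, "a", "2"))  # prints "update"
```

In this sketch every record costs one lookup, one write, and one commit; in the setup described above that sequence would be running in 40 threads at once.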