Hello,
I have PostgreSQL 9.3.4 running on Linux, with about 20 databases in the cluster.
The whole cluster was migrated from 9.2 using pg_upgradecluster.
After the migration, autovacuum started to fail in one database, crashing the entire cluster:
2014-07-13 21:16:24 MSK [5665]: [1-1] db=,user= PANIC: corrupted item pointer: offset = 5292, size = 24
2014-07-13 21:16:24 MSK [29131]: [417-1] db=,user= LOG: server process (PID 5665) was terminated by signal 6: Aborted
2014-07-13 21:16:24 MSK [29131]: [418-1] db=,user= DETAIL: Failed process was running: autovacuum: VACUUM public.postfix_stat0 (to prevent wraparound)
2014-07-13 21:16:24 MSK [29131]: [419-1] db=,user= LOG: terminating any other active server processes
2014-07-13 21:16:24 MSK [29597]: [1-1] db=,user= WARNING: terminating connection because of crash of another server process
I have two questions:
1) Why does a problem in only one database, in only one place in memory, become a problem for the entire server? The affected database is not important, but the corrupted memory inside it leads to frequent cluster-wide restarts, so the whole server suffers from this local problem.
Why does the postmaster have to restart all backends when only one of them dies?
2) What is the best modern way to analyze and fix such an issue?
Thank you.