On Tue, Sep 10, 2013 at 10:37 AM, Chris Curvey <ccurvey@xxxxxxxxxxxxxxxxxx> wrote:
Great thought. Looking through the logs, it appears that all my failures are on a CREATE INDEX. Usually on my biggest table, but often on another table.
2013-09-10 10:09:46 EDT ERROR: canceling autovacuum task
2013-09-10 10:09:46 EDT CONTEXT: automatic analyze of table "certified_mail_ccc2.public.cm_status_history"
2013-09-10 10:15:13 EDT LOG: server process (PID 14386) was terminated by signal 11: Segmentation fault
2013-09-10 10:15:13 EDT DETAIL: Failed process was running: CREATE INDEX cm_envelope_tracking_number ON cm_envelope USING btree (tracking_number);
2013-09-10 10:15:13 EDT LOG: terminating any other active server processes
2013-09-10 10:15:13 EDT WARNING: terminating connection because of crash of another server process
2013-09-10 10:15:13 EDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
I cannot square this with the fact that when I echo the commands, the last echoed command is about setting privileges.
A backend is crashing, and taking down the entire PostgreSQL system. The commands you see being echoed are from a different process from the one that triggered the crash, so it is just an innocent bystander which has no useful information. Are you using parallel restore? (If not, why is there someone indexing your biggest table during the restore?)
You will want to get the backtrace of the coredump generated by the crashed backend, not of the running process. Have you tried taking a bt with gdb? You said you couldn't find the symbols, but have you tried it anyway? On CentOS and openSuse I often get warnings about some symbols not being found, but all the symbols I actually need to interpret the backtrace end up being there.
Cheers,Jeff
Many thanks!