Search Postgresql Archives

Re: UTC4115FATAL: the database system is in recovery mode

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 05/30/2011 10:29 PM, Mathew Samuel wrote:

2011-03-28 10:44:28 UTC3609HINT: Consider increasing the configuration
parameter "checkpoint_segments".
2011-03-28 10:44:38 UTC3609LOG: checkpoints are occurring too frequently
(10 seconds apart)
2011-03-28 10:44:38 UTC3609HINT: Consider increasing the configuration
parameter "checkpoint_segments".
2011-03-28 10:44:42 UTC3932ERROR: canceling statement due to statement
timeout
2011-03-28 10:44:42 UTC3932STATEMENT: vacuum full analyze _zamboni.sl_log_1
2011-03-28 10:44:42 UTC3932PANIC: cannot abort transaction 1827110275,
it was already committed
2011-03-28 10:44:42 UTC3566LOG: server process (PID 3932) was terminated
by signal 6

Interesting. It almost looks like a VACUUM FULL ANALYZE was cancelled by statement_timeout, couldn't be aborted (assuming it was in fact 1827110275) and then the backend crashed with a signal 6 (SIGABRT). SIGABRT can be caused by an assertion failure, certain fatal aborts in the C library caused by memory allocation errors, etc.

Alas, while PostgreSQL may have dumped a core file I doubt there's any debug information in your build. If you do find a core file for that process ID, it might be worth checking for a debuginfo rpm just in case.

In fact those last 3 lines are repeated over and over again repeatedly
until "UTC4115FATAL: the database system is in recovery mode" is logged
for 4 hours. At some point, 4 hours later of course, it appears that the
system recovers.

Wow. Four hours recovery with default checkpoint settings.

Is it possible that the server was completely overloaded and was swapping heavily? That could explain why VACUUM timed out in the first place, and would explain why it took such a long time to recover. Check your system logs around the same time for other indications of excessive load, and check your monitoring history if you have monitoring like Cacti or the like active.

See if there's anything interesting in the kernel logs too.

Just for completeness, can you send all non-commented-out, non-blank lines in your postgresql.conf ?

$ egrep '^[^#[:space:]]' postgresql.conf |cut -d '#' -f 1

--
Craig Ringer

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux