Hi!
Sorry for the quick updates to my own messages, but I didn't want to lean back and wait - so I took to more aggressive measures. All my other databases in this cluster are fine - and the 'postgres' database doesn't seem to do anything really useful except being the default database. I dropped it and recreated it with template1 as template, afterwards I could start up my cluster with no problems whatsoever. I'd still like to find out what exactly happened here so I can prevent the same from happening again in the future. The age(datfrozenxid) is positive again:
# SELECT datname, age(datfrozenxid), datfrozenxid FROM pg_database where datname='postgres';
datname | age | datfrozenxid
----------+-----------+--------------
postgres | 100291695 | 3882762765
(1 Zeile)
datname | age | datfrozenxid
----------+-----------+--------------
postgres | 100291695 | 3882762765
(1 Zeile)
As I mentioned earlier, I'm running autovaccuum and use a nightly cron to run vacuum verbose analyze over all my databases. So lack of vacuum cannot be the issue, I think. But what else could have happened here? I regularly scan my logs, and there was no early warning for this issue.
The first event of this type in the server log was from today:
<2008-03-21 17:08:48 CET - 32161: xxx.xxx.xxx.xxx(52833)@magazine>WARNUNG: Datenbank »postgres« muss innerhalb von 11000000 Transaktionen gevacuumt werden
<2008-03-21 17:08:48 CET - 32161: xxx.xxx.xxx.xxx(52833)@magazine>TIPP: Um ein Abschalten der Datenbank zu vermeiden, führen Sie in »postgres« ein VACUUM über die komplette Datenbank aus.
<2008-03-21 17:08:48 CET - 32161: xxx.xxx.xxx.xxx(52833)@magazine>TIPP: Um ein Abschalten der Datenbank zu vermeiden, führen Sie in »postgres« ein VACUUM über die komplette Datenbank aus.
(i.e. database 'postgres' need to be vacuumed within 11000000 transactions...)
A mere three hours later, the server already refused any further requests:
<2008-03-21 20:05:21 CET - 25184: xxx.xxx.xxx.xxx(60837)@magazine>FEHLER: Datenbank nimmt keine Befehle an, um Datenverlust in Datenbank »postgres« wegen Transaktionsnummernüberlauf zu vermeiden
(ie. database no longer accepts any commands in order to prevent data loss in database 'postgres' because of transaction id wraparound)
Now that the adrenaline level has dropped to normal, I'd still like to know what exactly has happened here; The cluster has been initdb'ed on 2007-04-27.
Kind regards
Markus
Computec Media AG
Sitz der Gesellschaft und Registergericht: Fürth (HRB 8818)
Vorstandsmitglieder: Johannes S. Gözalan (Vorsitzender) und Niels Herrmann
Vorsitzender des Aufsichtsrates: Jürg Marquard
Umsatzsteuer-Identifikationsnummer: DE 812 575 276