Re: Help! PostgreSQL stuck at starting up after crash

I don't know how to tell whether the WAL files are corrupted.
At the end of recovery, the PostgreSQL log shows:

2012-01-18 18:30:58.570 MST     3666 - LOG:  consistent recovery state reached at 56C/CD0AFE00
2012-01-18 18:30:58.587 MST     3666 - LOG:  recovery stopping before abort of transaction 541802043, time 2012-01-18 12:50:08.531615-07
2012-01-18 18:30:58.587 MST     3666 - LOG:  redo done at 56C/CD226C58
2012-01-18 18:30:58.587 MST     3666 - LOG:  last completed transaction was at log time 2012-01-18 12:49:28.321605-07
2012-01-18 18:30:58.589 MST     3666 - LOG:  selected new timeline ID: 2
2012-01-18 18:30:59.187 MST     3666 - LOG:  archive recovery complete

Nothing happens after that: PostgreSQL stays stuck at startup and never leaves archive recovery mode.
At that point there is no CPU or disk activity, and it seems to be waiting for something.
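For anyone who wants to check for the same symptom in their own log, here is a rough Python sketch. It assumes the log_line_prefix layout shown in the excerpt above (timestamp, time zone, PID, " - "), which is site-specific, and the helper names are made up for illustration. It flags the case where "archive recovery complete" was logged but the usual "database system is ready to accept connections" message never followed.

```python
import re
from datetime import datetime

# Log excerpt copied from this message.
LOG = """\
2012-01-18 18:30:58.570 MST     3666 - LOG:  consistent recovery state reached at 56C/CD0AFE00
2012-01-18 18:30:58.587 MST     3666 - LOG:  recovery stopping before abort of transaction 541802043, time 2012-01-18 12:50:08.531615-07
2012-01-18 18:30:58.587 MST     3666 - LOG:  redo done at 56C/CD226C58
2012-01-18 18:30:58.587 MST     3666 - LOG:  last completed transaction was at log time 2012-01-18 12:49:28.321605-07
2012-01-18 18:30:58.589 MST     3666 - LOG:  selected new timeline ID: 2
2012-01-18 18:30:59.187 MST     3666 - LOG:  archive recovery complete
"""

# Assumption: this pattern matches only the prefix format seen above.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \S+\s+(\d+) - LOG:\s+(.*)$"
)


def parse_log(text):
    """Return a list of (timestamp, pid, message) for every LOG line."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
            events.append((ts, int(m.group(2)), m.group(3)))
    return events


def stuck_after_recovery(events):
    """True if recovery finished but the server never announced it was ready."""
    messages = [msg for _, _, msg in events]
    finished = "archive recovery complete" in messages
    ready = "database system is ready to accept connections" in messages
    return finished and not ready


if __name__ == "__main__":
    events = parse_log(LOG)
    print("last message:", events[-1][2])
    print("stuck after recovery:", stuck_after_recovery(events))
```

This only reads the log text. It says nothing about why startup stalls after recovery.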

Fortunately this is a development/test database, and we have no backup plan for it because data loss is not a big issue.
In production we do set up streaming replication and ongoing backups of both the database and the WAL files.
However, this does raise some worries, especially since my impression was that PostgreSQL protects data pretty well.

I will give pg_resetxlog a try. Thanks for the heads-up.


On Thu, Jan 19, 2012 at 7:30 AM, David Hornsby <david@xxxxxxxxxxxxx> wrote:
Sounds like you have corrupt WAL files, in which case you will have to reset
the WAL with pg_resetxlog.

http://www.postgresql.org/docs/8.2/static/app-pgresetxlog.html

This will result in missing transactions, so before you do this, shut down
postgres and make a copy of the database files first. That way if you
don't like what happens you can always go back to the way things were.
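The "copy first" step could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: copy_data_dir is a hypothetical helper, the paths in the usage comment are placeholders, and the postmaster.pid check is just a guard, since a stale postmaster.pid can be left behind after a kill -9 and has to be verified by hand.

```python
import shutil
from pathlib import Path


def copy_data_dir(data_dir, backup_dir):
    """Copy a stopped cluster's data directory before any destructive step.

    Refuses to run while postmaster.pid exists, because that file normally
    means the server is still running (or left a stale lock file behind).
    """
    data_dir = Path(data_dir)
    backup_dir = Path(backup_dir)

    if (data_dir / "postmaster.pid").exists():
        raise RuntimeError(
            "postmaster.pid found: stop the server, or confirm the file is stale"
        )
    if backup_dir.exists():
        raise FileExistsError(f"backup target already exists: {backup_dir}")

    # symlinks=True keeps symlinks as symlinks, so tablespaces that live
    # outside the data directory are NOT copied and need a separate copy.
    shutil.copytree(data_dir, backup_dir, symlinks=True)
    return backup_dir


# Example (placeholder paths, adjust to your installation):
# copy_data_dir("/var/lib/pgsql/9.1/data", "/var/lib/pgsql/9.1/data.before-reset")
```

Only after the copy succeeds would pg_resetxlog be run against the original directory. If the result is not acceptable, the copy can be moved back into place.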

Also right now would be a good time to evaluate your backup strategy,
which is a different topic for a different thread, but I can certainly
help with that as well.

-David Hornsby

> Version: PostgreSQL 9.1.1 on CentOS 5 x64
>
> We experienced slow performance and found the server was running 3 vacuum
> processes on the same db, which used up 99% of CPU.
> We then ran kill -9 on one of those processes, which caused PostgreSQL to
> crash, and it tried to restart after the crash.
> However, when the startup process reached the last WAL file, it got stuck
> there.
>
> pg_controldata shows the db is in archive recovery mode, and when using
> psql to connect to the db, it says FATAL: the database system is starting up.
>
> I took a chance and upgraded to PostgreSQL 9.1.2 to see if anything
> changed, but it is still stuck at the end of recovery.
> pg_controldata shows the db is in crash recovery, but that is probably just
> different wording, I think.
> Using psql to connect to the db, it says FATAL: the database system is
> starting up.
>
> I have pretty much run out of ideas here.
> Can anyone suggest where to go from here?
>
> Samuel
>
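As a side note on the pg_controldata check mentioned above, a small Python sketch like the one below can pull out the "Database cluster state" line so the two states can be compared across restarts. The helper names are hypothetical, the sample text contains only that one line instead of the full pg_controldata output, and it assumes untranslated (English) output.

```python
# Cluster states that mean the server is still replaying WAL.
RECOVERY_STATES = {"in crash recovery", "in archive recovery"}


def cluster_state(controldata_output):
    """Return the value of the 'Database cluster state' line, or None."""
    for line in controldata_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Database cluster state":
            return value.strip()
    return None


def is_recovering(controldata_output):
    """True if the control file reports crash or archive recovery."""
    return cluster_state(controldata_output) in RECOVERY_STATES


# Abbreviated sample: only the line of interest, not real full output.
SAMPLE = "Database cluster state:               in archive recovery\n"

if __name__ == "__main__":
    print(cluster_state(SAMPLE))
    print(is_recovering(SAMPLE))
```

In practice the input text would come from running pg_controldata against the data directory, for example via subprocess.run(["pg_controldata", data_dir], capture_output=True, text=True).stdout.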





--

Shian-Miin Samuel Hwang | Software Developer | Phone 1-403-2626519 (ext. 276) | Fax 1-403-233-8046




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux