Kakoli Sen wrote:
Hello all,
It was running fine initially and the database was lying idle for a
few days. Today I logged into the machine and stopped the server by
killing the process with 'kill -9 pid'. I then restarted it with
'postmaster -i -D /opt/pgsql/data/'.
Why did you use `kill -9'? Was it not responding to `kill -15' (i.e.
SIGTERM, kill -TERM) or to a shutdown using the init script?
SIGKILL, i.e. signal 9, terminates the process without giving it a chance
to clean its state up. It gets no chance to write out buffered data,
mark data files as clean, or take any other safe shutdown actions. It's
a REALLY REALLY BAD IDEA to do this on a database server, though the
server should still be able to recover if it's configured to operate
with fsync enabled and the other safe I/O options left on.
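
To make the difference concrete, here is a small toy sketch in plain sh. It
is not PostgreSQL: `worker' is a hypothetical stand-in for a server process
that marks its state file "clean" when it receives SIGTERM. All names and
files in it are made up for illustration.

```shell
#!/bin/sh
# Toy demonstration only: "worker" stands in for a server process and is
# not PostgreSQL. All names and files here are hypothetical.

workdir=$(mktemp -d)

worker() {
    state_file=$1
    echo dirty > "$state_file"                        # state while running
    trap 'echo clean > "$state_file"; exit 0' TERM    # cleanup on SIGTERM
    : > "$state_file.ready"                           # tell the parent the trap is set
    while :; do sleep 1; done
}

run_and_signal() {
    # $1 = signal name (TERM or KILL), $2 = state file
    worker "$2" >/dev/null 2>&1 &
    pid=$!
    # Wait until the worker has installed its handler before signalling it.
    while [ ! -e "$2.ready" ]; do sleep 1; done
    kill -s "$1" "$pid"
    wait "$pid" 2>/dev/null || true
    cat "$2"
}

term_state=$(run_and_signal TERM "$workdir/term.state")
kill_state=$(run_and_signal KILL "$workdir/kill.state")

echo "after SIGTERM: $term_state"    # prints "after SIGTERM: clean"
echo "after SIGKILL: $kill_state"    # prints "after SIGKILL: dirty"

rm -rf "$workdir"
```

The SIGTERM handler runs and leaves the state file marked clean; SIGKILL
cannot be caught, so the handler never runs and the file stays dirty. For
the real server, `pg_ctl stop -D /opt/pgsql/data -m fast' (data directory
taken from your postmaster command line) asks for an orderly shutdown, and
a plain SIGTERM to the postmaster requests a "smart" shutdown that waits
for clients to disconnect.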
Then it gives the following error on stdout :
LOG: database system was interrupted at 2008-03-06 14:15:17 IST
LOG: record with incorrect prev-link 1/0 at 0/A4EB08
LOG: invalid primary checkpoint record
LOG: record with incorrect prev-link 42FD/0 at 0/A4EAC8
LOG: invalid secondary checkpoint record
PANIC: could not locate a valid checkpoint record
Ouch. It can't handle either of the checkpoints, and so it can't load
the database.
I don't know what database repair tools exist, but personally at this
point I'd be glad my backups are always kept up to date.
What is the problem? It was running fine all this time.
I suspect that killing it without giving it a chance to do any cleanup
operations might not have helped.
What's your server configuration? Could you have disabled any safe I/O
options to get some more speed out of the database, perhaps?
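
One quick way to check is to look for an uncommented fsync line in
postgresql.conf. The sketch below is only an illustration: `fsync_setting'
is a hypothetical helper, and it runs against a made-up sample file rather
than your real one. I am assuming your config sits in the data directory,
i.e. /opt/pgsql/data/postgresql.conf, and that settings are written in the
usual "name = value" form.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the last uncommented
# "fsync = ..." line in a postgresql.conf-style file.
# Prints nothing if no such line exists (the server default then applies).
fsync_setting() {
    sed -n 's/^[[:space:]]*fsync[[:space:]]*=[[:space:]]*\([^#[:space:]]*\).*/\1/p' "$1" |
        tail -n 1
}

# Made-up sample file, for illustration only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
#fsync = true
shared_buffers = 1000
fsync = false   # turned off for speed
EOF

found=$(fsync_setting "$sample")
echo "fsync is set to: $found"    # prints "fsync is set to: false"

rm -f "$sample"
```

Against the real file you would call it as
`fsync_setting /opt/pgsql/data/postgresql.conf'. An empty result means
fsync was never overridden, in which case it should be on by default.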
I'm pretty sure 8.x copes with SIGKILL (because of its use of write-ahead
logging (WAL), strong fsync requirements, etc.), though of course it's
still not a good idea. I don't know about 7.x.
--
Craig Ringer