Re: postmaster fails to start


Hi,
1) When the postmaster was started the first time, the only problem was that the .pid file had not been erased, since the machine had been restarted. There was no other postmaster running.
2) All the WAL settings are at their defaults:
#---------------------------------------------------------------------------
# WRITE AHEAD LOG
#---------------------------------------------------------------------------

# - Settings -

#fsync = true			# turns forced synchronization on or off
#wal_sync_method = fsync	# the default varies across platforms:
				# fsync, fdatasync, open_sync, or open_datasync
#wal_buffers = 8		# min 4, 8KB each
#commit_delay = 0		# range 0-100000, in microseconds
#commit_siblings = 5		# range 1-1000

# - Checkpoints -

#checkpoint_segments = 3	# in logfile segments, min 1, 16MB each
#checkpoint_timeout = 300	# range 30-3600, in seconds
#checkpoint_warning = 30	# 0 is off, in seconds

# - Archiving -

#archive_command = ''		# command to use to archive a logfile segment

3) I have the data backed up in other databases (not as a file backup), so I am not really concerned about losing the data in this specific case. The problem is that the postmaster isn't starting, so I can't even restore the data. Most importantly, I would like to learn from this case what to do the next time this problem happens to me in the field.

Regards,
Nir.

-----Original Message-----
From: Richard Huxton [mailto:dev@xxxxxxxxxxxx]
Sent: Wednesday, May 25, 2005 11:51 AM
To: Dweck Nir
Cc: postgreSQL mailing list (E-mail)
Subject: Re:  postmaster fails to start


I've taken the liberty of rearranging your email slightly.

Dweck Nir wrote:
> The sequence of events was as follow: 1) computer was shut down
> without stopping postmaster.

OK - not good. Some crucial questions:
1. Do you have fsync enabled or disabled in the postgresql.conf file?
2. Do you know whether your drives are flushing write-cache properly?
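
For question 1, a setting that is commented out in postgresql.conf keeps its built-in default, so only an uncommented line can turn fsync off. A minimal Python sketch of that check follows. The function names are illustrative, the parser is deliberately naive (it treats every `#` as the start of a comment, even inside a quoted value), and the set of spellings accepted as "off" is an assumption.

```python
import re

# Spellings treated as "disabled" here; this set is an assumption of the sketch.
OFF_VALUES = {"off", "false", "no", "0"}

def active_settings(conf_text):
    """Return {name: value} for the uncommented 'name = value' lines."""
    settings = {}
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = re.match(r"(\w+)\s*=\s*(.*)$", line)
        if match:
            name, value = match.groups()
            settings[name] = value.strip().strip("'\"")
    return settings

def fsync_explicitly_off(conf_text):
    """True only if an uncommented fsync line disables it."""
    value = active_settings(conf_text).get("fsync")
    return value is not None and value.lower() in OFF_VALUES
```

Calling `fsync_explicitly_off(open("postgresql.conf").read())` on a file where every WAL line is still commented out returns False, meaning the defaults apply. It says nothing about question 2: whether the drives honour the flush is outside what the config file can show.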

> 2) postmaster was started, but because of an error that there might
> be another postmaster running, the postmaster was started again.

Was this just a matter of deleting the .pid file and did you check there 
wasn't another postmaster running?
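
A rough way to make that check before deleting the file is sketched below in Python. It assumes the first line of postmaster.pid in the data directory holds the postmaster's PID, and it is Unix-only: `os.kill(pid, 0)` probes for a process without signalling it on Unix, but must not be used this way on Windows. The function names are illustrative.

```python
import os

def postmaster_pid(data_dir):
    """Return the PID recorded in postmaster.pid, or None if it is missing or unreadable."""
    path = os.path.join(data_dir, "postmaster.pid")
    try:
        with open(path) as f:
            first_line = f.readline().strip()
    except FileNotFoundError:
        return None
    try:
        return int(first_line)
    except ValueError:
        return None

def process_exists(pid):
    """Unix-only probe: signal 0 checks for existence without delivering a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

If `postmaster_pid()` returns a PID and `process_exists()` is False for it, the file is stale. A True result is weaker evidence: after a reboot the PID may have been reused by an unrelated process, so it is still worth looking at what that process actually is.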

> 3) since then each time I try to start the postmaster I get the same
> error.


> LOG:  redo starts at 1/A500075C
> PANIC:  btree_delete_page_redo: lost target page
> LOG:  startup process (PID 4409) was terminated by signal 6

OK - well, this error message is in backend/access/nbtree/nbtxlog.c 
where it is replaying the write-ahead-log files for btrees (I'm no 
hacker, I just searched the source for the error message and read the 
comments).

So - it looks like you might have a corrupted WAL. That shouldn't be 
possible if you were running with fsync enabled and drives that flushed 
cache like they should, so I'm guessing that wasn't the case.

It might be possible to recover to a state before this point, but that's 
not something I'm going to be able to advise on. There are two steps you 
should take immediately though.

1. Take a file-backup of your entire data directory and keep it safe. 
You might well be making repeated attempts to recover this.
2. Check your most recent database backup and restore it to another 
machine - it may be quicker to restore that than fix your file corruption.
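
Step 1 can be done with any file-level copy while the postmaster is stopped. As one possible sketch in Python, the function below writes a timestamped tar.gz of the whole data directory. The function name and the archive naming scheme are illustrative, and the destination directory must be outside the data directory.

```python
import os
import tarfile
import time

def backup_data_dir(data_dir, dest_dir):
    """Archive data_dir into dest_dir/pgdata-<timestamp>.tar.gz and return the archive path."""
    data_dir = os.path.abspath(data_dir)  # also drops any trailing slash
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = os.path.join(dest_dir, f"pgdata-{stamp}.tar.gz")
    with tarfile.open(archive, "w:gz") as tar:
        # Store paths relative to the directory name, e.g. "data/PG_VERSION".
        tar.add(data_dir, arcname=os.path.basename(data_dir))
    return archive
```

Taking the copy before each recovery attempt means every attempt can start from the same untouched files.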

-- 
   Richard Huxton
   Archonet Ltd


