Hi All,
I'm relatively new to PostgreSQL, having inherited this server from a
previous admin, so please bear with me if these are obvious
questions.
My scenario:
- Last weekend I had a scheduled maintenance window for
power/air conditioning work.
- Prior to this outage the server was running fine.
- All servers were shut down cleanly prior to the outage, as far
as I can tell.
- My PostgreSQL service failed to come back up after the outage.
- I have no logs to indicate an error.
- I'm running:
- postgresql-server 9.2.13
- CentOS 6.6
- I have backups and a development machine that I can roll to in
the short term but I'd really like to get this sorted out.
What I've tried so far:
- Checked disk space on the data partition
- Started the server in single-user mode in the foreground
- Started the server in multi-user mode in the foreground
Details of these are outlined below.
Disk Space
Web searches suggest that the usual suspect for this kind of failure
is disk space, but df indicates that is not the case here:
df -hl /var/lib/pgsql/
Filesystem Size Used Avail Use% Mounted on
/dev/sdc1 917G 519G 353G 60% /var/lib/pgsql
There are, however, some quite large tables and indices so I
can't definitively rule this out.
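Free blocks are only part of the picture: a filesystem can also run out of inodes while df -h still shows space available. Below is a minimal sketch of a check that covers both, assuming a df that supports -h and -i (GNU coreutils does). report_usage is a hypothetical helper name chosen for illustration, not part of any PostgreSQL tooling.

```shell
#!/bin/sh
# report_usage DIR
# Hypothetical helper: show block and inode usage for the filesystem
# that holds DIR. Both need headroom for the server to create files.
report_usage() {
    dir="$1"
    # Block usage in human-readable units (same view as the df output above).
    df -h "$dir"
    # Inode usage: IUse% at 100% means no new files can be created,
    # even when the block usage above looks healthy.
    df -i "$dir"
}
```

On this server it would be called as report_usage /var/lib/pgsql. If both the Use% and IUse% columns are well below 100%, disk exhaustion is an unlikely cause.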
Single User Mode
I can start the server in single-user mode:
-bash-4.1$ /usr/pgsql-9.2/bin/postmaster --single -p 5432 -D /var/lib/pgsql/9.2/data -d 5
DEBUG: invoking IpcMemoryCreate(size=6612303872)
DEBUG: SlruScanDirectory invoking callback on pg_notify/0000
DEBUG: removing file "pg_notify/0000"
DEBUG: InitPostgres
DEBUG: my backend ID is 1
LOG: database system was shut down at 2015-12-14 10:35:37 AEDT
DEBUG: checkpoint record is at 3ED/7EC8CA68
DEBUG: redo record is at 3ED/7EC8CA68; shutdown TRUE
DEBUG: next transaction ID: 0/343785580; next OID: 136370
DEBUG: next MultiXactId: 441782; next MultiXactOffset: 956443
DEBUG: oldest unfrozen transaction ID: 150016627, in database 12870
DEBUG: transaction ID wrap limit is 2297500274, limited by database with OID 12870
DEBUG: StartTransaction
DEBUG: name: unnamed; blockState: DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
DEBUG: CommitTransaction
DEBUG: name: unnamed; blockState: STARTED; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
PostgreSQL stand-alone backend 9.2.13
backend> [CTRL-D] DEBUG: shmem_exit(0): 11 callbacks to make
LOG: shutting down
DEBUG: SlruScanDirectory invoking callback on pg_multixact/offsets/0006
DEBUG: SlruScanDirectory invoking callback on pg_multixact/members/000E
DEBUG: attempting to remove WAL segments older than log file 00000001000003ED0000007D
DEBUG: SlruScanDirectory invoking callback on pg_subtrans/147D
LOG: database system is shut down
DEBUG: proc_exit(0): 3 callbacks to make
DEBUG: exit(0)
DEBUG: shmem_exit(-1): 0 callbacks to make
DEBUG: proc_exit(-1): 0 callbacks to make
There do not appear to be any errors there.
Multi-User Mode
I cannot start in multi-user mode:
-bash-4.1$ /usr/pgsql-9.2/bin/postmaster -p 5432 -D /var/lib/pgsql/9.2/data -d 5
DEBUG: postmaster: PostmasterMain: initial environment dump:
DEBUG: -----------------------------------------
DEBUG: HOSTNAME=syd-pgsql-00.wmawater.com.au
DEBUG: SHELL=/bin/bash
DEBUG: TERM=xterm-256color
DEBUG: HISTSIZE=1000
DEBUG: QTDIR=/usr/lib64/qt-3.3
DEBUG: QTINC=/usr/lib64/qt-3.3/include
DEBUG: USER=postgres
<snip DEBUG: LS_COLORS>
DEBUG: MAIL=/var/spool/mail/postgres
DEBUG: PATH=/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin
DEBUG: PWD=/var/lib/pgsql
DEBUG: LANG=en_US.UTF-8
DEBUG: HISTCONTROL=ignoredups
DEBUG: SHLVL=1
DEBUG: HOME=/var/lib/pgsql
DEBUG: LOGNAME=postgres
DEBUG: QTLIB=/usr/lib64/qt-3.3/lib
DEBUG: PGDATA=/var/lib/pgsql/9.2/data
DEBUG: LESSOPEN=||/usr/bin/lesspipe.sh %s
DEBUG: G_BROKEN_FILENAMES=1
DEBUG: _=/usr/pgsql-9.2/bin/postmaster
DEBUG: PGLOCALEDIR=/usr/pgsql-9.2/share/locale
DEBUG: PGSYSCONFDIR=/etc/sysconfig/pgsql
DEBUG: LC_COLLATE=en_US.UTF-8
DEBUG: LC_CTYPE=en_US.UTF-8
DEBUG: LC_MESSAGES=en_US.UTF-8
DEBUG: LC_MONETARY=C
DEBUG: LC_NUMERIC=C
DEBUG: LC_TIME=C
DEBUG: -----------------------------------------
DEBUG: invoking IpcMemoryCreate(size=6612303872)
DEBUG: SlruScanDirectory invoking callback on pg_notify/0000
DEBUG: removing file "pg_notify/0000"
DEBUG: max_safe_fds = 984, usable_fds = 1000, already_open = 6
DEBUG: logger shutting down
DEBUG: shmem_exit(0): 0 callbacks to make
DEBUG: proc_exit(0): 0 callbacks to make
DEBUG: exit(0)
DEBUG: shmem_exit(-1): 0 callbacks to make
DEBUG: proc_exit(-1): 0 callbacks to make
Again there appears to be nothing logged to indicate why the
server is not starting at this point.
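One thing worth checking: the "logger shutting down" line suggests the logging collector was started, in which case anything the postmaster reported after that point may have gone to a log file rather than to the terminal. Below is a minimal sketch for finding where that output would land. Both function names are hypothetical helpers, and the pg_log location in the usage note is an assumption (a common default under the data directory) that I have not confirmed against this server's postgresql.conf.

```shell
#!/bin/sh
# show_log_settings CONF
# Hypothetical helper: print the uncommented logging-related settings
# from a postgresql.conf, to see whether output is redirected to files.
show_log_settings() {
    conf="$1"
    grep -E '^[[:space:]]*(logging_collector|log_destination|log_directory|log_filename)[[:space:]]*=' "$conf"
}

# newest_log DIR
# Hypothetical helper: print the path of the most recently modified
# file in DIR, or nothing (and return 1) if DIR is empty.
newest_log() {
    log_dir="$1"
    # ls -t sorts by modification time, newest first.
    newest=$(ls -t "$log_dir" | head -n 1)
    if [ -n "$newest" ]; then
        printf '%s/%s\n' "$log_dir" "$newest"
    else
        return 1
    fi
}
```

Possible usage, assuming the log directory is pg_log under the data directory:

show_log_settings /var/lib/pgsql/9.2/data/postgresql.conf
tail -n 50 "$(newest_log /var/lib/pgsql/9.2/data/pg_log)"

If log_directory is set to something else, the first command should show it, and that path would replace pg_log in the second.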
I would be very grateful for any suggestions at this point.
Cheers,
-pete