Hi, our PostgreSQL 8.1.3 system on a quad Xeon has been happily chugging
away for weeks with no stability problems until yesterday:
/var/log/syslog:May 4 11:57:17 cayenne kernel: postmaster[19291]:
segfault at 0000000000000000 rip 00002aaaab5e8c00 rsp 00007fffffffd418
error 4
/var/log/syslog.0:May 3 09:39:06 cayenne kernel: postmaster[32698]:
segfault at 0000000000000000 rip 00002aaaab5e8c00 rsp 00007fffffffd418
error 4
/var/log/syslog.0:May 3 11:02:00 cayenne kernel: postmaster[12427]:
segfault at 0000000000000000 rip 00002aaaab5e8c00 rsp 00007fffffffd418
error 4
I don't know what the rip + rsp values represent, but is it interesting
that they are identical in all three cases?
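For what it's worth, those kernel lines can be picked apart mechanically.
What follows is only a sketch, assuming the standard x86-64 page-fault
error-code bits (bit 0: protection violation rather than page not
present, bit 1: write, bit 2: user mode) and assuming the kernel prints
the code in hex. rip and rsp are the instruction pointer and stack
pointer registers at the time of the fault. The name decode_segfault and
the regex are made up for illustration:

```python
import re

# x86 / x86-64 page-fault error-code bits (assumed layout, see lead-in).
PF_PROT = 0x1   # 0 = page not present, 1 = protection violation
PF_WRITE = 0x2  # 0 = read access,      1 = write access
PF_USER = 0x4   # 0 = kernel mode,      1 = user mode

# Matches the "segfault at ... rip ... rsp ... error N" kernel message.
# \s+ is used between fields so a mail-wrapped line still matches.
SEGFAULT_RE = re.compile(
    r"(?P<prog>[^\s\[]+)\[(?P<pid>\d+)\]:\s+segfault at (?P<addr>[0-9a-f]+)"
    r"\s+rip (?P<rip>[0-9a-f]+)\s+rsp (?P<rsp>[0-9a-f]+)"
    r"\s+error (?P<err>[0-9a-f]+)"
)


def decode_segfault(line):
    """Return a dict describing one kernel segfault message, or None."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)  # assumption: printed in hex
    return {
        "program": m.group("prog"),
        "pid": int(m.group("pid")),
        "fault_address": int(m.group("addr"), 16),
        "rip": int(m.group("rip"), 16),   # instruction pointer
        "rsp": int(m.group("rsp"), 16),   # stack pointer
        "protection_violation": bool(err & PF_PROT),
        "write": bool(err & PF_WRITE),
        "user_mode": bool(err & PF_USER),
    }


if __name__ == "__main__":
    sample = ("May 4 11:57:17 cayenne kernel: postmaster[19291]: "
              "segfault at 0000000000000000 rip 00002aaaab5e8c00 "
              "rsp 00007fffffffd418 error 4")
    print(decode_segfault(sample))
```

Read that way, "error 4" would be a user-mode read of an unmapped page,
and a fault address of 0 looks like a NULL pointer dereference; an
identical rip in all three crashes would mean the same instruction
faulted each time.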
Not a single OS change has occurred on the machine - in fact the only
thing happening other than pg itself is me tailing the logs.
I'm using Debian sarge with the 8.1.3 debs from backports.org, which I
trust. I doubt running postmaster under gdb will be workable due to the
performance penalty.
The pg logs don't show much of interest:
2006-05-04 11:57:17 BST LOG: server process (PID 19291) was terminated
by signal 11
2006-05-04 11:57:17 BST LOG: terminating any other active server processes
2006-05-04 11:57:17 BST WARNING: terminating connection because of
crash of another server process
2006-05-04 11:57:17 BST DETAIL: The postmaster has commanded this
server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
2006-05-04 11:57:17 BST HINT: In a moment you should be able to
reconnect to the database and repeat your command.
[loads of these]
2006-05-04 11:57:18 BST FATAL: the database system is in recovery mode
2006-05-04 11:57:18 BST FATAL: the database system is in recovery mode
2006-05-04 11:57:18 BST FATAL: the database system is in recovery mode
2006-05-04 11:57:18 BST FATAL: the database system is in recovery mode
2006-05-04 11:57:18 BST LOG: all server processes terminated;
reinitializing
2006-05-04 11:57:18 BST FATAL: the database system is starting up
2006-05-04 11:57:18 BST FATAL: the database system is starting up
2006-05-04 11:57:18 BST FATAL: the database system is starting up
2006-05-04 11:57:18 BST FATAL: the database system is starting up
2006-05-04 11:57:18 BST FATAL: the database system is starting up
2006-05-04 11:57:18 BST LOG: database system was interrupted at
2006-05-04 11:56:17 BST
2006-05-04 11:57:18 BST LOG: checkpoint record is at 68/A9D2F2E8
2006-05-04 11:57:18 BST LOG: redo record is at 68/A9D17DD0; undo record
is at 0/0; shutdown FALSE
2006-05-04 11:57:18 BST LOG: next transaction ID: 728532363; next OID:
183302937
2006-05-04 11:57:18 BST LOG: next MultiXactId: 46957; next
MultiXactOffset: 98539
2006-05-04 11:57:18 BST LOG: database system was not properly shut
down; automatic recovery in progress
2006-05-04 11:57:18 BST LOG: redo starts at 68/A9D17DD0
2006-05-04 11:57:18 BST FATAL: the database system is starting up
[loads of these]
2006-05-04 11:57:19 BST LOG: record with zero length at 68/ABAF4F48
2006-05-04 11:57:19 BST LOG: redo done at 68/ABAF4F18
2006-05-04 11:57:19 BST LOG: could not truncate directory
"pg_multixact/members": apparent wraparound
2006-05-04 11:57:19 BST LOG: database system is ready
2006-05-04 11:57:19 BST LOG: transaction ID wrap limit is 1362094701,
limited by database "postgres"
Encouragingly, pg_config shows that --enable-debug was passed as a
./configure argument:
CONFIGURE = '--build=x86_64-linux' '--prefix=/usr'
'--includedir=/usr/include' '--mandir=/usr/share/man'
'--infodir=/usr/share/info' '--sysconfdir=/etc' '--localstatedir=/var'
'--libexecdir=/usr/lib/postgresql-8.1' '--srcdir=.'
'--disable-maintainer-mode' '--mandir=/usr/share/postgresql/8.1/man'
'--with-docdir=/usr/share/doc/postgresql-doc-8.1'
'--datadir=/usr/share/postgresql/8.1'
'--bindir=/usr/lib/postgresql/8.1/bin'
'--includedir=/usr/include/postgresql/' '--enable-nls'
'--enable-integer-datetimes' '--enable-debug' '--disable-rpath'
'--with-tcl' '--with-perl' '--with-python' '--with-pam' '--with-krb5'
'--with-openssl' '--with-gnu-ld' '--with-tclconfig=/usr/lib/tcl8.4'
'--with-tkconfig=/usr/lib/tk8.4' '--with-includes=/usr/include/tcl8.4'
'--with-pgport=5432' '--enable-thread-safety' 'CC=cc' 'CFLAGS=-g -Wall
-O2 -Wl,--as-needed' 'build_alias=x86_64-linux'
CC = cc
CPPFLAGS = -D_GNU_SOURCE -I/usr/include/tcl8.4
CFLAGS = -g -Wall -O2 -Wl,--as-needed -Wall -Wmissing-prototypes
-Wpointer-arith -Winline -Wendif-labels -fno-strict-aliasing -g
CFLAGS_SL = -fpic
LDFLAGS =
LDFLAGS_SL =
LIBS = -lpgport -lpam -lssl -lcrypto -lkrb5 -lz -lreadline -lcrypt
-lresolv -lnsl -ldl -lm
VERSION = PostgreSQL 8.1.3
How can I enable coredumps or something similarly useful for debugging
purposes?
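One possible approach, sketched under assumptions rather than tested on
this box: the core-file size limit (RLIMIT_CORE, what "ulimit -c" shows
in a shell) is inherited by child processes, so raising it in whatever
starts the postmaster should let a crashing backend leave a core file.
The function name allow_core_dumps and the idea of a small launcher
script are hypothetical; only the standard resource module calls are
real:

```python
import resource


def allow_core_dumps():
    """Raise the soft core-size limit to the hard limit for this process.

    Processes started from here afterwards inherit the new limit.
    Returns the (soft, hard) pair now in effect.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # An unprivileged process may raise its soft limit up to the hard limit.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    soft, hard = allow_core_dumps()
    unlimited = soft == resource.RLIM_INFINITY
    print("core limit: soft=%s hard=%s unlimited=%s" % (soft, hard, unlimited))
    # Hypothetical next step: start the server from this process (for
    # example with os.execv) so that it inherits the raised limit.
```

If the hard limit is itself 0, it would have to be raised as root before
the server starts. A core file would then normally land in the crashing
process's working directory, and could be opened with gdb against the
postgres binary to get a backtrace.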
Cheers,
Gavin.