Hi all,
I'm experiencing signal 11 (segmentation fault) failures on the
master node of a 3-node Slony-I cluster. Over the past week or so, we've
averaged a little more than one segfault per day (11 times in the
past 10 days, including today). Any ideas what's going on?
Does anyone know how to track this issue down?
I don't know if attaching log output would help, but what we're seeing
is very similar to the following threads (the responses to those
threads didn't help us, though):
http://archives.postgresql.org/pgsql-general/2004-06/msg01204.php
http://www.thescripts.com/forum/thread422225.html
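In case it's useful for tracking the frequency, here is a rough sketch of how the crashes could be tallied per day from the server log. It assumes that each log line begins with a YYYY-MM-DD timestamp and that a backend crash is logged with the phrase "terminated by signal 11". Both depend on the logging setup, so adjust them to match the actual log. The function name and the script name in the usage comment are made up for illustration.

```python
import re
import sys
from collections import Counter

# Assumption: log lines start with a YYYY-MM-DD timestamp.
DATE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2})")
# Assumption: a crashed backend is reported with this phrase.
CRASH_RE = re.compile(r"terminated by signal 11\b")


def count_segfaults_per_day(lines):
    """Return a Counter mapping a date string to the number of signal 11 crashes."""
    counts = Counter()
    for line in lines:
        if not CRASH_RE.search(line):
            continue
        match = DATE_RE.match(line)
        # Lines without a recognizable date are grouped under "unknown".
        day = match.group(1) if match else "unknown"
        counts[day] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python count_segfaults.py /path/to/server.log
    with open(sys.argv[1], errors="replace") as log_file:
        for day, n in sorted(count_segfaults_per_day(log_file).items()):
            print(day, n)
```

A per-day count like this makes it easier to see whether the crashes cluster around particular times, such as a nightly job.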
Here's the machine where postgres is faulting:
db1 (Dell 6650):
master Slony-I node
PostgreSQL version: 7.4.6
OS: Debian Linux 3.1
CPU: 4 x Xeon 2.5GHz
RAM: 8 GB
DISK:
/ 4 x 18 GB drive: raid 10
/db/data/base 12 x 36 GB: raid 10
/db/data/pg_xlog 2 x 73 GB: raid 1
The other two machines don't die, even though they're set up pretty much
the same way. The only difference is that db2 is running 8.1.3.
What seems odd to me is that db1 and db3 are pretty much identical
(db3 has a 1.40GHz Xeon instead of a 2.5GHz, and some RAM
differences), yet postgres dies all the time on db1 and has yet to
die on db2 or db3. Since db1 is the master and takes the application's
writes directly, I'm guessing maybe it's an UPDATE/INSERT/etc.?
Everything was running fine until last Tuesday, when this started
happening. We haven't created any new stored procedures or made any
other changes of that sort.
We've rebooted the db1 machine, but to no avail. Any other suggestions?
Let me know if you need other info...
Any help would be greatly appreciated!
--Richard