On Sat, Jun 30, 2007 at 01:41:03PM -0500, Chris Harms wrote:
> I am experiencing periodic failovers due to a floating IP address not
> passing the status check:
>
> clurgmgrd: [9975]: <warning> Failed to ping 192.168.13.204
> Jun 30 11:41:47 nodeA clurgmgrd[9975]: <notice> status on ip
> "192.168.13.204" returned 1 (generic error)
>
> Both nodes have bonded NICs with gigabit connections to redundant
> switches, so it is unlikely they are going down, nothing in the logs
> about linux losing the links. I parked all the cluster services - 2
> Postgres services and 1 Apache - on one node and allowed it to run
> overnight. There would be no client activity during this time. One
> Postgres service failed two times in this manner and the other failed
> once in this manner. The Apache service did not fail.
>
> What can I do to resolve this or get more information out of the system
> to resolve this?

Hmm, with bonded NICs, ip.sh monitors the links of the physical
devices. It's supposed to check and not complain if either link is up.

The ping bit is a bit weird; you could just disable it in
/usr/share/cluster/ip.sh. I.e. change the 'ping' line to '/bin/true'

--
Lon Hohberger - Software Engineer - Red Hat, Inc.

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster