Re: "Missed too many heartbeats" messages and hung cluster

Patrick Caulfield wrote:

Jun 23 23:37:17 AICLSRV02 kernel: CMAN: removing node AICLSRV01 from the
cluster : Missed too many heartbeats


That message means that the heartbeat messages are getting lost somehow,
either through an unreliable network link or something else odd happening on
the machine to prevent the heartbeat packets reaching the network.

This is very strange, since the two machines are connected by a gigabit crossover cable with no other device in between. Also, no firewall rules are configured on either machine.
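
To rule out the link itself, one option might be to look at the kernel's per-interface error counters on both nodes. The following is only a minimal sketch: it assumes a Linux system that exposes counters under /sys/class/net/<iface>/statistics/, and the interface name "eth1" is a placeholder for whichever NIC carries the crossover cable. The function names are made up for illustration and are not part of CMAN or any cluster tool.

```python
from pathlib import Path

# Counter files the Linux kernel exposes under
# /sys/class/net/<iface>/statistics/
COUNTERS = ("rx_errors", "rx_dropped", "tx_errors", "tx_dropped", "collisions")


def read_link_counters(iface, sysfs_root="/sys/class/net"):
    """Return {counter_name: value} for one interface.

    Counter files that do not exist are skipped, so an unknown
    interface simply yields an empty dict.
    """
    stats_dir = Path(sysfs_root) / iface / "statistics"
    counters = {}
    for name in COUNTERS:
        path = stats_dir / name
        if path.is_file():
            counters[name] = int(path.read_text().strip())
    return counters


def nonzero_counters(counters):
    """Keep only the counters that have recorded at least one event."""
    return {name: value for name, value in counters.items() if value > 0}


if __name__ == "__main__":
    iface = "eth1"  # assumption: the NIC on the crossover link
    problems = nonzero_counters(read_link_counters(iface))
    if problems:
        print(f"{iface}: non-zero error counters: {problems}")
    else:
        print(f"{iface}: no errors or drops recorded (or interface not found)")
```

Counters that stay at zero on both nodes would suggest the cable and NICs are not the cause, which would point back to something on the machine itself delaying the heartbeat packets.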

By the way, I am currently using the manual fencing method, but it isn't very helpful, and I would like to switch to a method that ensures a reliable service. Does that mean I have to buy a device that sits between the machines and connects to their network and power cables? I am rather new to this, so any suggestions are welcome.

--
Fabrizio Lippolis                fabrizio.lippolis@xxxxxxxxxxxxxxxxxxxx
Auriga Informatica s.r.l.            Via Don Guanella 15/B - 70124 Bari
Tel.: 080/5025414 - Fax: 080/5027448 - http://www.aurigainformatica.it/

--

Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
