Re: Re: "Missed too many heartbeats" messages and hung cluster

> -----Original Message-----
> From: linux-cluster-bounces@xxxxxxxxxx 
> [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of 
> Fabrizio Lippolis
> Sent: Tuesday, 27 June 2006 11:52
> To: linux clustering
> Subject: Re: Re: "Missed too many heartbeats" 
> messages and hung cluster
> 
> Leandro Dardini wrote:
> 
> > If something happens between the two machines, they fence each other.
> 
> I have configured manual fencing, but as I wrote it's not very 
> useful since, I think, it requires manual intervention, which 
> may not be possible immediately. Therefore I am looking for 
> a way to keep the services running even if such a thing 
> happens. This is not the first time the problem has arisen, 
> apparently without a reason, though the last occurrence was 
> a long time ago.
> 
> > You can try to "ping" each machine from the other and check 
> > the connectivity state when the problem arises.
> 
> Sometimes the machines are completely locked up and it's not 
> even possible to log in. A forced power-off is 
> necessary in that case. Sometimes it looks like only the cluster 
> service is locked up, and I can still ping the other machine 
> even though the cluster is not working.

This is really bad. It smells like a hardware problem or a buggy kernel driver. Try stress testing the machines individually, without cluster support. I usually start with a memtest from a Knoppix CD and then compile a kernel to stress the CPU. Try transferring huge chunks of data to test the LAN.
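
A minimal sketch of such a LAN transfer test, assuming Python 3 is available on both nodes. The function names (`receive`, `send`, `loopback_check`), the 1 MiB chunk size, and the loopback self-test are illustrative assumptions, not part of any cluster tool. It streams random data over TCP and compares SHA-256 digests on both ends, so corruption or a stalled transfer becomes visible.

```python
import hashlib
import os
import socket
import threading
import time

CHUNK = 1 << 20  # 1 MiB per send (arbitrary choice)


def receive(server_sock, result):
    """Accept one connection, hash everything received, record digest and size."""
    conn, _ = server_sock.accept()
    digest = hashlib.sha256()
    total = 0
    with conn:
        while True:
            data = conn.recv(65536)
            if not data:  # sender closed the connection
                break
            digest.update(data)
            total += len(data)
    result["sha256"] = digest.hexdigest()
    result["bytes"] = total


def send(host, port, total_bytes):
    """Send total_bytes of random data; return the SHA-256 of what was sent."""
    digest = hashlib.sha256()
    with socket.create_connection((host, port)) as sock:
        sent = 0
        while sent < total_bytes:
            block = os.urandom(min(CHUNK, total_bytes - sent))
            sock.sendall(block)
            digest.update(block)
            sent += len(block)
    return digest.hexdigest()


def loopback_check(total_bytes=8 * CHUNK):
    """Self-test: run receiver and sender in one process over 127.0.0.1."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    result = {}
    worker = threading.Thread(target=receive, args=(server, result))
    worker.start()
    start = time.monotonic()
    sent_hash = send("127.0.0.1", port, total_bytes)
    worker.join()
    elapsed = time.monotonic() - start
    server.close()
    return sent_hash == result["sha256"], result["bytes"], elapsed


if __name__ == "__main__":
    ok, nbytes, secs = loopback_check()
    print(f"match={ok} bytes={nbytes} seconds={secs:.2f}")
```

The loopback run only checks that the script itself works. To exercise the crossover link, the receiver would have to listen on one node's cluster interface address while the other node calls `send` against it, with a much larger `total_bytes` and repeated runs.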

Leandro

> 
> > Maybe an overly "intelligent" switch is handling the 
> > traffic and applying some sort of "traffic shaping and control".
> 
> There is nothing like that. The two machines are connected by 
> a 1 Gb crossover cable, not even a long one, supplied by HP with 
> the two machines.
> 
> -- 
> Fabrizio Lippolis                
> fabrizio.lippolis@xxxxxxxxxxxxxxxxxxxx
> Auriga Informatica s.r.l.            Via Don Guanella 15/B - 
> 70124 Bari
> Tel.: 080/5025414 - Fax: 080/5027448 - 
> http://www.aurigainformatica.it/
> 
> --
> 
> Linux-cluster@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 
