> -----Original Message-----
> From: linux-cluster-bounces@xxxxxxxxxx
> [mailto:linux-cluster-bounces@xxxxxxxxxx] On behalf of
> Fabrizio Lippolis
> Sent: Tuesday, 27 June 2006 11:52
> To: linux clustering
> Subject: Re: R: "Missed too many heartbeats"
> messages and hung cluster
>
> Leandro Dardini wrote:
>
> > If something happens between the two machines, they fence each other.
>
> I have configured manual fencing but, as I wrote, it's not very
> useful since, I think, it requires manual handling, which
> might not be possible immediately. Therefore I am looking for
> a method to let the services run even if such a thing
> happens. This is not the first time the problem has arisen,
> apparently without a reason, though the last time was a
> long time ago.
>
> > You can try to "ping" each other and see, when the problem
> > arises, the connectivity state.
>
> Sometimes the machines are completely locked and it's not
> even possible to log in. A brute-force power-off is
> necessary in this case. Sometimes it looks like only the cluster
> service is locked and I can still ping the other machine
> even though the cluster is not working. This is really bad.

This smells like a hardware problem or a buggy kernel driver. Try to
stress test the machines individually without cluster support. I usually
start with a memtest from a Knoppix CD and then build a kernel to stress
the CPU. Try transferring a huge chunk of data to test the LAN.

Leandro

>
> > Maybe a "too intelligent" switch is handling the
> > traffic and has some sort of "traffic shaping and control".
>
> There is nothing like that; the two machines are connected by
> a 1 Gb crossover cable, not even a long one, provided by HP with
> the two machines.
>
> --
> Fabrizio Lippolis
> fabrizio.lippolis@xxxxxxxxxxxxxxxxxxxx
> Auriga Informatica s.r.l. Via Don Guanella 15/B -
> 70124 Bari
> Tel.: 080/5025414 - Fax: 080/5027448 -
> http://www.aurigainformatica.it/
>
> --
>
> Linux-cluster@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/linux-cluster
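
As a rough illustration of the "transfer a huge chunk of data" suggestion in the reply, here is a minimal Python sketch that pushes a fixed number of bytes over a TCP connection and checks that every byte arrives. It is not from the original thread: the function names (`receive_all`, `send_bytes`, `loopback_check`), the chunk size, and the default transfer size are all hypothetical choices, and the self-test only runs over 127.0.0.1, so it exercises the code rather than the actual link.

```python
import socket
import threading
import time

CHUNK = 64 * 1024  # bytes per send/recv call (arbitrary choice)


def receive_all(server_sock, result):
    """Accept one connection and count bytes until the sender closes it."""
    conn, _addr = server_sock.accept()
    received = 0
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:  # empty read means the peer closed the connection
                break
            received += len(data)
    result["received"] = received


def send_bytes(host, port, total_bytes):
    """Send total_bytes of zero bytes to host:port; return elapsed seconds."""
    payload = bytes(CHUNK)
    sent = 0
    start = time.monotonic()
    with socket.create_connection((host, port)) as sock:
        while sent < total_bytes:
            n = min(CHUNK, total_bytes - sent)
            sock.sendall(payload[:n])
            sent += n
    return time.monotonic() - start


def loopback_check(total_bytes=8 * 1024 * 1024):
    """Run receiver and sender in one process over 127.0.0.1 as a self-test."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    result = {}
    receiver = threading.Thread(target=receive_all, args=(server, result))
    receiver.start()
    elapsed = send_bytes("127.0.0.1", port, total_bytes)
    receiver.join()
    server.close()
    return result["received"], elapsed


if __name__ == "__main__":
    received, elapsed = loopback_check()
    print(f"received {received} bytes, sender took {elapsed:.3f} s")
```

To exercise the real cable, the receiver half would have to run on one node (bound to that node's address on the crossover interface) and `send_bytes` on the other, with a much larger `total_bytes`. A stall or a byte-count mismatch there, with the cluster software stopped, would point at the NIC, driver, or cable rather than at the cluster stack.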