On Thu, Aug 02, 2007 at 02:00:13PM -0500, Jay Leafey wrote:
> Lon Hohberger wrote:
> >On Tue, Jul 31, 2007 at 10:48:44AM -0500, Jay Leafey wrote:
> >>I've got a 3-node cluster running CentOS 4.5 and I cannot communicate
> >>with the resource group manager. When I use the clustat command I get a
> >>timeout:
> >>
> >>>[root@rapier ~]# clustat
> >>>Timed out waiting for a response from Resource Group Manager
> >>>Member Status: Quorate
> >>>
> >>>  Member Name            Status
> >>>  ------ ----            ------
> >>>  rapier.utmem.edu       Online, Local, rgmanager
> >>>  thorax.utmem.edu       Offline
> >>>  cyclops.utmem.edu      Online, rgmanager
> >
> >>>Fence Domain: "default" 2 2 recover 4 -
> >>>[1 2]
> >
> >Until fencing completes, rgmanager won't respond.
> >
> >fence_ack_manual needs to be run.
> >
> >>><SNIP>
> >>>
> >>>User: "usrm::manager" 10 10 recover 2 -
> >>>[1 2]
> >>>
> >
>
> Your reply was a bit confusing at first, but looking deeper showed you
> were right on the mark. The systems (using HP ILO fencing) were unable
> to communicate with each other very well or with the ILO ports at all.
> Turns out some of the ports they were configured on had been moved to a
> different VLAN, so the network was split between the ILOs and the host
> ports.

Sorry, I just assumed you were using manual fencing as opposed to iLO,
since that's the 90+/- % case of why fencing was stuck in the 'recover'
state.

I guess we all know what happens when you assume... :) Or maybe, when I
assume?

-- Lon

-- 
Lon Hohberger - Software Engineer - Red Hat, Inc.