Re: rgmanager or clustat problem

Lon Hohberger <lhh@xxxxxxxxxx> · Mon, 9 Apr 2007 16:16:16 -0400

On Mon, Apr 09, 2007 at 12:22:26PM -0500, David M wrote:
> I am running a four node GFS cluster with about 20 services per node.  All
> four nodes belong to the same failover domain, and they each have a priority
> of 1.  My shared storage is an iSCSI SAN.
> 
> After rgmanager has been running for a couple of days, clustat produces the
> following result on all four nodes:
> 
> Timed out waiting for a response from Resource Group Manager
> Member Status: Quorate
> 
>  Member Name                              Status
>  ------ ----                              ------
>  node01                                   Online, rgmanager
>  node02                                   Online, Local, rgmanager
>  node03                                   Online, rgmanager
>  node04                                   Online, rgmanager
> 
> I also get a time out when I try to determine the status of a particular
> service with "clustat -s servicename".
> 
> All of the services seem to be up and running, but clustat does not work.
> Is there something wrong?  Is there a way for me to increase the time out?
> 
> clurgmgrd and dlm_recvd seem to be using a lot of CPU cycles on Node02, 40
> and 60 percent, respectively.

What version of rgmanager do you have installed?  It sounds like you're
hitting bug #212644, which is fixed in the packages available here:

http://people.redhat.com/lhh/packages.html

(It will also be fixed in the next Red Hat update, which I suspect will
then trickle down to CentOS.)
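
For anyone collecting the same information on their own nodes, here is a
minimal sketch of the two checks discussed above: which rgmanager package
is installed, and whether clustat is reporting the timeout.  It assumes an
RPM-based system where "rpm -q rgmanager" works and that clustat prints
the plain-text layout quoted above.  The function names are hypothetical
helpers written for illustration, not part of rgmanager.

```python
import subprocess

# Message clustat prints when rgmanager does not answer (taken from the report above).
TIMEOUT_MSG = "Timed out waiting for a response from Resource Group Manager"


def rgmanager_version():
    """Return the installed rgmanager package string, or None if unavailable.

    Assumption: the node uses RPM, so `rpm -q rgmanager` reports the package.
    """
    try:
        result = subprocess.run(
            ["rpm", "-q", "rgmanager"], capture_output=True, text=True
        )
    except FileNotFoundError:
        return None  # rpm itself is not installed on this machine
    return result.stdout.strip() if result.returncode == 0 else None


def parse_clustat(text):
    """Return (timed_out, members) from clustat's plain-text output.

    members maps node name -> list of status flags,
    e.g. {"node01": ["Online", "rgmanager"]}.
    """
    timed_out = TIMEOUT_MSG in text
    members = {}
    in_table = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("------"):
            in_table = True  # the dashed rule precedes the member rows
            continue
        if in_table:
            if not stripped:
                break  # a blank line ends the member table
            name, _, status = stripped.partition(" ")
            members[name] = [flag.strip() for flag in status.split(",")]
    return timed_out, members
```

Feeding the saved output of clustat from each node into parse_clustat()
shows at a glance whether the timeout appears everywhere, and the result of
rgmanager_version() is the detail to include when reporting the problem.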

-- Lon

-- 
Lon Hohberger - Software Engineer - Red Hat, Inc.

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster