On Mon, Apr 09, 2007 at 12:22:26PM -0500, David M wrote:
> I am running a four node GFS cluster with about 20 services per node. All
> four nodes belong to the same failover domain, and they each have a priority
> of 1. My shared storage is an iSCSI SAN.
>
> After rgmanager has been running for a couple of days, clustat produces the
> following result on all four nodes:
>
> Timed out waiting for a response from Resource Group Manager
> Member Status: Quorate
>
> Member Name    Status
> ------ ----    ------
> node01         Online, rgmanager
> node02         Online, Local, rgmanager
> node03         Online, rgmanager
> node04         Online, rgmanager
>
> I also get a time out when I try to determine the status of a particular
> service with "clustat -s servicename".
>
> All of the services seem to be up and running, but clustat does not work.
> Is there something wrong? Is there a way for me to increase the time out?
>
> clurgmgrd and dlm_recvd seem to be using a lot of CPU cycles on Node02, 40
> and 60 percent, respectively.

What version of rgmanager do you have installed? It sounds like you're
hitting #212644, which is fixed with packages available here:

http://people.redhat.com/lhh/packages.html

(It will also be fixed in the next Red Hat update, which will then
trickle down to CentOS, I suspect)

-- Lon

--
Lon Hohberger - Software Engineer - Red Hat, Inc.

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster