Re: What does FAIL_STOP_WAIT state mean for clvmd and rgmanager

I'm not sure; possibly it was from doing a "service cman restart".

I understand it's always preferable to reboot with cluster suite, but some of our physical hosts can take 20 minutes to do a full reboot, so I'm always looking for some way to fix them online.

Joel

On Fri, Sep 10, 2010 at 4:03 AM, Lon Hohberger <lhh@xxxxxxxxxx> wrote:
On Mon, 2010-08-23 at 17:58 +1000, Joel Heenan wrote:
> Can someone please explain what this means and what you can do to get
> out of it:
>
> [root@cluster-host ~]# group_tool -v
> type             level name       id       state node id local_done
> fence            0     default    00010003 JOIN_STOP_WAIT 1 100050001 1
> [1 1 2 3 4]
> dlm              1     clvmd      00020003 FAIL_STOP_WAIT 2 200030003 1
> [1 2 3 4]
> dlm              1     rgmanager  00030003 FAIL_STOP_WAIT 2 200030003 1
> [1 2 3 4]

It looks like fencing has not completed.  How do you have 2 node 1's in
the fencing group?
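
For anyone who wants to spot this kind of thing automatically, here is a rough Python sketch that scans "group_tool -v" output for a node ID listed more than once in a group's member list. It is not part of cman or group_tool; the function name duplicate_members is made up for illustration, and the parsing assumes the layout matches the output quoted in this thread (one line per group, followed by its bracketed member list). The sample text is that quoted output with the mail-wrapped lines rejoined.

```python
import re
from collections import Counter

# Output quoted earlier in this thread, with mail-wrapped lines rejoined.
SAMPLE = """\
type             level name       id       state node id local_done
fence            0     default    00010003 JOIN_STOP_WAIT 1 100050001 1
[1 1 2 3 4]
dlm              1     clvmd      00020003 FAIL_STOP_WAIT 2 200030003 1
[1 2 3 4]
dlm              1     rgmanager  00030003 FAIL_STOP_WAIT 2 200030003 1
[1 2 3 4]
"""


def duplicate_members(text):
    """Return {(type, name): [node ids listed more than once]} per group.

    Hypothetical helper: assumes each group line is followed by a line
    holding the member list in square brackets.
    """
    dupes = {}
    current = None  # (type, name) of the most recent group line
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("type"):
            continue  # skip blank lines and the column header
        members = re.fullmatch(r"\[([\d\s]*)\]", line)
        if members:
            if current is not None:
                counts = Counter(int(n) for n in members.group(1).split())
                repeated = sorted(n for n, c in counts.items() if c > 1)
                if repeated:
                    dupes[current] = repeated
            continue
        fields = line.split()
        if len(fields) >= 3:  # ignores stray wrapped fragments such as "1"
            current = (fields[0], fields[2])
    return dupes


if __name__ == "__main__":
    for (gtype, name), ids in duplicate_members(SAMPLE).items():
        print(f"{gtype}/{name}: node id(s) listed more than once: {ids}")
```

On a live node you could feed it the real output instead of SAMPLE, for example by piping "group_tool -v" into a script that reads sys.stdin. It only flags the duplicate; it does not say why the node appears twice.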

-- Lon

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
