Re: Why does GFS2 freeze?

David Teigland <teigland@xxxxxxxxxx> · Wed, 27 Nov 2013 10:15:27 -0500

On Wed, Nov 27, 2013 at 05:03:34PM +0200, Vladimir Melnik wrote:
> Dear colleagues,
> 
> Thank you all for your help yesterday. Alas, I had to reboot both
> nodes at about 5am, because the situation had become critical.
> 
> I have now built almost the same cluster in a lab. Its configuration is
> REALLY simple:
> 
> <?xml version="1.0"?>
> <cluster name="cluster1" config_version="1">
>   <cman two_node="1" expected_votes="1"/>
>   <clusternodes>
>     <clusternode name="vlan201.eth0.host1.lab.***.net" votes="1" nodeid="1"/>
>     <clusternode name="vlan201.eth0.host2.lab.***.net" votes="1" nodeid="2"/>
>   </clusternodes>
>   <fencedevices/>
>   <rm>
>     <failoverdomains/>
>     <resources/>
>   </rm>
> </cluster>
> 
> I started iscsi, cman, clvmd, gfs2 and rgmanager. When I disconnect one
> node, the other node can no longer use the shared filesystem: any attempt
> to open a file just hangs. When I reconnect the node, files open again
> and everything works.
> 
> I'm sorry if it's a noob's question, but why does it ignore
> <cman two_node="1" expected_votes="1"/>?

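two_node="1" is not being ignored: it only affects quorum, so the
surviving node stays quorate on its own.  Quorum does not replace fencing,
though.  The command line tools (e.g. fence_tool ls, group_tool ls) and
/var/log/messages will probably reveal that everything is waiting on
fencing to complete for the disconnected node.  Since you do not have
fencing configured, you need to manually reset the disconnected node and
then notify fenced that you have done so (using fence_ack_manual).  See
the man pages, e.g. fenced(8).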
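
For comparison, here is a rough sketch of how the same cluster.conf might
look with fencing filled in, so that recovery does not depend on a manual
acknowledgement.  It assumes both lab hosts have IPMI-capable management
controllers and uses the fence_ipmilan agent, which is only one of several
possible agents.  The device names, IP addresses, login and password below
are hypothetical placeholders, not values from this thread.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the two-node lab cluster with a fence device per node. -->
<!-- Device names, addresses and credentials are hypothetical placeholders. -->
<cluster name="cluster1" config_version="2">
  <!-- two_node only lets a single node stay quorate; fencing is still needed -->
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="vlan201.eth0.host1.lab.***.net" votes="1" nodeid="1">
      <fence>
        <method name="1">
          <!-- refers to the fencedevice with the same name below -->
          <device name="ipmi-host1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="vlan201.eth0.host2.lab.***.net" votes="1" nodeid="2">
      <fence>
        <method name="1">
          <device name="ipmi-host2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <!-- assumed IPMI controllers; replace agent and parameters to match real hardware -->
    <fencedevice name="ipmi-host1" agent="fence_ipmilan" ipaddr="192.0.2.11" login="admin" passwd="CHANGEME"/>
    <fencedevice name="ipmi-host2" agent="fence_ipmilan" ipaddr="192.0.2.12" login="admin" passwd="CHANGEME"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>
```

With something along these lines in place, fenced could reset the
disconnected node itself, after which GFS2 recovery on the surviving node
can proceed.  config_version is raised to 2 because the file has changed.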

-- 
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster