Re: dlm-pcmk-3.0.17-1.fc14.x86_64 and gfs-pcmk-3.0.17-1.fc14.x86_64 woes

FYI, per:

> Cluster shutdown tips
> ---------------------
>
> * Avoiding a partly shutdown cluster due to lost quorum.
>
> There is a practical timing issue with respect to the shutdown steps being run
> on all nodes when shutting down an entire cluster (or most of it).  When
> shutting down the entire cluster (or shutting down a node for an extended
> period) use "cman_tool leave remove". This automatically reduces the number
> of votes needed for quorum as each node leaves and prevents the loss of quorum
> which could keep the last nodes from cleanly completing shutdown.
>
> The "remove" leave option should not be used in general, since it
> introduces potential split-brain risks.
>
> If the "remove" leave option is not used, quorum will be lost after enough > nodes have left the cluster. Once the cluster is inquorate, remaining members
> that have not yet completed "fence_tool leave" in the steps above will be
> stuck.  Operations such as umounting gfs or leaving the fence domain will
> block while the cluster is inquorate. They can continue and complete only
> when quorum is regained.
>
> If this happens, one option is to join the cluster ("cman_tool join") on some
> of the nodes that have left so that the cluster regains quorum and the stuck
> nodes can complete their shutdown. Another option is to forcibly reduce the
> number of expected votes for the cluster which allows the cluster to become
> quorate again ("cman_tool expected <votes>").
>
> ...
>
> Two node clusters
> -----------------
>
> Ordinarily the loss of quorum after one node fails out of two will prevent the
> remaining node from continuing (if both nodes have one vote.) Some special
> configuration options can be set to allow the one remaining node to continue
> operating if the other fails. To do this only two nodes with one vote each can
> be defined in cluster.conf. The two_node and expected_votes values must then be
> set to 1 in the cman config section as follows.
>
>   <cman two_node="1" expected_votes="1">
>   </cman>
>

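As an aside, the clean whole-cluster shutdown described above boils down to
roughly the following on each node. This is only a sketch of the steps from
usage.txt; the mount point is an example from my own setup (and if Pacemaker
manages the mount, stopping Pacemaker takes care of the umount):

umount /var/www/html        # unmount any gfs/gfs2 filesystems first
fence_tool leave            # then leave the fence domain
cman_tool leave remove      # "remove" lowers the expected votes as each node leaves

# If plain "cman_tool leave" was used and the remaining nodes are stuck
# inquorate, either rejoin a node that already left ("cman_tool join") or
# force the expected votes down on a surviving node:
cman_tool expected 1
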
Based on the above (from http://sourceware.org/cluster/doc/usage.txt), it looks like example C.1 in http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Clusters_from_Scratch/index.html#ap-cman should be changed to:

<?xml version="1.0"?>
<cluster config_version="1" name="beekhof">
  <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="pcmk-1" nodeid="1">
      <fence/>
    </clusternode>
    <clusternode name="pcmk-2" nodeid="2">
      <fence/>
    </clusternode>
  </clusternodes>
  <cman two_node="1" expected_votes="1"/>
  <fencedevices/>
  <rm/>
</cluster>
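
A quick sanity check after making an edit like this (assuming the stock cman
tools are installed; ccs_config_validate ships with the cluster 3.x packages,
and the exact cman_tool status field names may vary slightly by version):

ccs_config_validate                              # validate cluster.conf against the schema
service cman restart                             # pick up the new two_node/expected_votes settings
cman_tool status | grep -i -e votes -e quorum    # expected votes and quorum should both be 1

# On a cluster that is already up you would bump config_version and propagate
# the new configuration instead of restarting cman outright.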

gb

On 03/10/2011 09:52 AM, Gregory Bartholomew wrote:
> On 03/10/2011 01:14 AM, Andrew Beekhof wrote:
>> On Wed, Mar 9, 2011 at 7:03 PM, Gregory Bartholomew
>> <gregory.lee.bartholomew@xxxxxxxxx> wrote:
>>> Never mind, I figured it out ... I needed to install the gfs2-cluster
>>> package and start its service and I also had a different name for my
>>> cluster in /etc/cluster/cluster.conf than what I was using in my
>>> mkfs.gfs2 command.
>>>
>>> It's all working now. Thanks to those who helped me get this going,
>>
>> So you're still using Pacemaker to mount/unmount the filesystem and
>> other services?
>> If so, were there any discrepancies in the documentation describing
>> how to configure this?
>
> Good morning,
>
> This is what I did to get the file system going:
>
> -----
>
> yum install -y httpd gfs2-cluster gfs2-utils
> chkconfig gfs2-cluster on
> service gfs2-cluster start
>
> mkfs.gfs2 -p lock_dlm -j 2 -t siue-cs:iscsi /dev/sda1
>
> cat <<-END | crm
> configure primitive gfs ocf:heartbeat:Filesystem params device="/dev/sda1" directory="/var/www/html" fstype="gfs2" op start interval="0" timeout="60s" op stop interval="0" timeout="60s"
> configure clone dual-gfs gfs
> END
>
> -----
>
> I think this sed command was also missing from the guide:
>
> sed -i '/^#<Location \/server-status>/,/#<\/Location>/{s/^#//;s/Allow from .example.com/Allow from 127.0.0.1/}' /etc/httpd/conf/httpd.conf
>
> I've attached the full record of all the commands that I used to set up
> my nodes to this email. It has, at the end, the final result of
> "crm configure show".
>
> gb
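
As a postscript for anyone following along: once the clone is defined,
something like this should confirm the filesystem is active on both nodes
(the output format differs a bit between crm_mon versions, so treat it as
approximate):

crm_mon -1          # one-shot cluster status; the dual-gfs clone should be Started on both nodes
mount -t gfs2       # list gfs2 mounts on the local node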

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster

