FC6 two-node cluster with GFS2 not working

So I've got two Dell blades set up, with multipath. They even cluster together, but once one node is up and has the GFS2 filesystem mounted, the other can't start the gfs2 service. I'm basing my setup on how I set up GFS with RHEL4; I realize this newer way has some more niceties to it and that I must be doing something wrong, but I'm not seeing much documentation on the differences, so I'm just trying to pull it off this way.


Basic rundown of setup:

Decently minimal non-X install.
Local drives are not LVM'd (which, by the way: if you have two boxes set up differently, one with LVM on the local drives and one without, it makes clvm a pain).


yum update
yum install screen ntp cman lvm2-cluster gfs2-utils

Put on a good firewall config (or turn it off; it behaves the same either way).
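For reference, the holes I punch are roughly these (the port list is what I pulled from the Red Hat cluster docs for the openais-based stack; since the behavior is identical with the firewall off, I don't think this is the problem):

iptables -A INPUT -p udp --dport 5404:5405 -j ACCEPT     # openais/cman
iptables -A INPUT -p tcp --dport 21064 -j ACCEPT         # dlm
iptables -A INPUT -p tcp --dport 50006:50009 -j ACCEPT   # ccsd
iptables -A INPUT -p udp --dport 50007 -j ACCEPT         # ccsd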

SELinux turned down to permissive.
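In other words:

setenforce 0                                                       # running system
sed -i 's/^SELINUX=.*/SELINUX=permissive/' /etc/selinux/config     # survives the reboot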

See the attached multipath.conf and cluster.conf (pasted at the end of this message).

After updating multipath.conf, I do this:
mkinitrd -f /boot/initrd-`uname -r` `uname -r`
init 6; exit

modprobe dm-multipath
modprobe dm-round-robin
service multipathd start

That part looks just fine.
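By "looks fine" I mean the LUN shows up with every path healthy; I'm eyeballing it with:

multipath -ll                    # each path should show [active][ready]
dmsetup ls --target multipath    # the multipath map should be listed here too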


Then, after updating cluster.conf on each node, I do 'ccs_tool addnodeids' (it said to do this when I tried to start cman the first time).

Then I do 'service cman start'.
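To confirm the cluster actually formed, I check with:

cman_tool status    # cluster name, quorum, votes
cman_tool nodes     # both nodes should be listed as members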

Everything looks fine: pvcreate, vgcreate, lvcreate, mkfs.gfs2, and voila, we have a GFS2-formatted volume visible on both systems.
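For completeness, the storage commands were roughly the following (the multipath device and VG/LV names are just placeholders here; the -t value is cluster_name:fs_name, and -j 2 gives one journal per node):

pvcreate /dev/mapper/mpath0
vgcreate vg_outmail /dev/mapper/mpath0
lvcreate -L 100G -n data vg_outmail            # size is just an example
mkfs.gfs2 -p lock_dlm -t outMail:data -j 2 /dev/vg_outmail/data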

I add the /etc/fstab entry and create the mount point.
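Something like this (the mount point is a placeholder, matching the LV above):

mkdir -p /mnt/data

and in /etc/fstab:

/dev/vg_outmail/data    /mnt/data    gfs2    defaults    0 0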

Next I start clvmd, then gfs2.
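That is:

service clvmd start
service gfs2 start       # the gfs2 init script mounts whatever is of type gfs2 in fstab
chkconfig cman on; chkconfig clvmd on; chkconfig gfs2 on    # so they come up at boot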

The first box starts gfs2 just fine; the second won't, and hangs at this (from /var/log/messages):

Nov 1 22:41:07 box2 kernel: audit(1162442467.427:150): avc: denied { connectto } for pid=3724 comm="mount.gfs2" path=006766735F636F6E74726F6C645F736F6$
Nov 1 22:41:07 box2 kernel: GFS2: fsid=: Trying to join cluster "lock_dlm", "outMail:data"
Nov 1 22:41:07 box2 kernel: audit(1162442467.451:151): avc: denied { search } for pid=3724 comm="mount.gfs2" name="dlm" dev=debugfs ino=13186 scontext$
Nov 1 22:41:07 box2 kernel: dlm: data: recover 1
Nov 1 22:41:07 box2 kernel: GFS2: fsid=outMail:data.1: Joined cluster. Now mounting FS...
Nov 1 22:41:07 box2 kernel: dlm: data: add member 1
Nov 1 22:41:07 box2 kernel: dlm: data: add member 2
Nov 1 22:49:07 box2 gfs_controld[3639]: mount: failed -17


Remember, SELinux is set to permissive.
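As an aside, the "-17" from gfs_controld looks like a negative errno, and 17 is EEXIST ("File exists"):

python -c 'import errno, os; print errno.errorcode[17], os.strerror(17)'    # -> EEXIST File exists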

So I shut down the box that came up fine on its own, manually started the services on box2 (the box that wasn't coming up), and it works fine. Then I turned box1 back on, and at boot it hangs in the same place box2 did.

I also realize that a two-node cluster is not preferred, but it's what I'm setting up and what I have access to at the moment, and honestly I'm not sure a third box would help (but it might).

Any suggestions?

-greg

--
http://www.gvtc.com
--
“While it is possible to change without improving, it is impossible to improve without changing.” -anonymous

“only he who attempts the absurd can achieve the impossible.” -anonymous

<?xml version="1.0"?>
<cluster name="outMail" config_version="2">
  <cman two_node="1" expected_votes="1">
  </cman>
  <clusternodes>
    <clusternode name="goumang.sgc" votes="1" nodeid="1">
      <fence>
        <method name="single">
          <device name="human" ipaddr="172.16.1.180"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="rushou.sgc" votes="1" nodeid="2">
      <fence>
        <method name="single">
          <device name="human" ipaddr="172.16.1.185"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fence_devices>
    <fence_device name="human" agent="fence_manual"/>
  </fence_devices>
</cluster>
## This is the /etc/multipath.conf file recommended for
## EMC storage devices.
##
## OS : RHEL 4 U3
## Arrays : CLARiiON and Symmetrix
##
## The blacklist is the enumeration of all devices that are to be
## excluded from multipath control
devnode_blacklist
{
        ## Replace the wwid with the output of the command
        ## 'scsi_id -g -u -s /block/[internal scsi disk name]'
        ## Enumerate the wwid for all internal scsi disks.
        ## Optionally, the wwid of VCM database may also be listed here.
        ## 3-4 Native Multipath Failover, DM-MPIO for v2.6.x Linux Kernel
        ## and EMC Storage Arrays Configuration Guide
        ## Native Multipath Failover on RHEL 4
        ##wwid 35005076718 d4224d
        ##or just do this:
        devnode "^sda"
        devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
        devnode "^hd[a-z][0-9]*"
        devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
}
## Use user friendly names, instead of using WWIDs as names.
defaults {
        ## Use user friendly names, instead of using WWIDs as names.
        user_friendly_names yes
}
devices {
        ## Device attributes requirements for EMC Symmetrix
        ## are part of the default definitions and do not require separate
        ## definition.
        ## Device attributes for EMC CLARiiON
        device {
                vendor "DGC "
                product "*"
                path_grouping_policy group_by_prio
                getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
                prio_callout "/sbin/mpath_prio_emc /dev/%n"
                path_checker emc_clariion
                path_selector "round-robin 0"
                features "1 queue_if_no_path"
                no_path_retry 300
                hardware_handler "1 emc"
                failback immediate
        }
}
