cluster not fencing after filesystem failure

Hi,

I'm having a problem on CentOS 6.5 with a two-node cluster for HA NFS. 
Here's the cluster.conf:  http://pastebin.com/aVAuUDtc

The cluster nodes are VMware guests.  Occasionally the node providing
the NFS service loses access to its disk device (I'm working with
VMware on that separately).  When that happens, the kernel shuts down
the XFS filesystem:

Apr 25 02:29:51 sdo-dds-nfsnode2 kernel: XFS (dm-10): metadata I/O error: block 0x170013a900 ("xlog_iodone") error 5 buf count 65536
Apr 25 02:29:51 sdo-dds-nfsnode2 kernel: XFS (dm-10): xfs_do_force_shutdown(0x2) called from line 1062 of file fs/xfs/xfs_log.c.  Return address = 0xffffffffa027f131
Apr 25 02:29:51 sdo-dds-nfsnode2 kernel: XFS (dm-10): Log I/O Error Detected.  Shutting down filesystem
Apr 25 02:29:51 sdo-dds-nfsnode2 kernel: nfsd: non-standard errno: 5
Apr 25 02:29:51 sdo-dds-nfsnode2 kernel: XFS (dm-10): Please umount the filesystem and rectify the problem(s)
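
For reference, here is a minimal sketch of how these shutdown messages
could be picked out of syslog automatically.  It assumes each entry sits
on a single line in the syslog format shown above; the name
find_xfs_shutdowns and the regular expression are my own illustration
and are not part of rgmanager or any other cluster tool.

```python
import re

# Matches the kernel's XFS forced-shutdown message as it appears in the excerpt.
# Assumption: syslog prefix of the form "Mon DD HH:MM:SS host kernel: XFS (dev): ..."
SHUTDOWN_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<host>\S+)\s+kernel:\s+"
    r"XFS \((?P<dev>[^)]+)\):\s+xfs_do_force_shutdown\("
)


def find_xfs_shutdowns(lines):
    """Return (timestamp, host, device) for every XFS forced-shutdown line."""
    hits = []
    for line in lines:
        m = SHUTDOWN_RE.match(line)
        if m:
            hits.append((m.group("stamp"), m.group("host"), m.group("dev")))
    return hits
```

Feeding it the lines of /var/log/messages (for example via
open("/var/log/messages")) would list each device the kernel shut down
and when.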

rgmanager noticed the filesystem problem (see the log at
http://pastebin.com/mPPBP2HY) and marked the "HA_nfs" service as
failed.

What I'm confused about is why no fencing takes place in this
scenario.  I'm guessing I have either a misunderstanding or a
misconfiguration.
At this point I'd like the other node to fence the failed one and take
over, or the failed node to fence itself.

I've tested fencing from the command line, and it works:
fence_vmware_soap --ip 192.168.50.9 --username ddsfence --password secret \
    -z --action reboot -U "423d288c-03ff-74bf-9a4f-bf661f8ed87b"

I'd appreciate any help with this.

Package versions, in case they matter:

rgmanager-3.0.12.1-19.el6.x86_64
cman-3.0.12.1-59.el6_5.2.x86_64

-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Robert Jacobson               Robert.C.Jacobson@xxxxxxxx
Lead System Admin       Solar Dynamics Observatory (SDO)
Bldg 14, E222                             (301) 286-1591 
