Please send your questions to the mailing list "linux-cluster@xxxxxxxxxx" instead of addressing me personally.
Regarding the configuration for manual fencing: I don't have it with me. It was available in RHEL 5.5. Check in the system-config-cluster tool whether you can add manual fencing.
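For reference, here is a rough sketch of what a manual-fencing section of cluster.conf looked like in RHEL 5 era setups, written as a small Python script that prints the XML fragment. The node name "node2", the device name "manual", votes="1" and the helper manual_fence_fragment are assumptions for illustration only, and fence_manual may not be shipped or supported on newer releases, so please verify against your own system before using any of it.

```python
import xml.etree.ElementTree as ET


def manual_fence_fragment(node_names, device_name="manual"):
    """Build <clusternodes> and <fencedevices> elements that use fence_manual.

    Hypothetical helper: the names and vote counts are illustrative only.
    """
    clusternodes = ET.Element("clusternodes")
    for nodeid, node in enumerate(node_names, start=1):
        clusternode = ET.SubElement(
            clusternodes,
            "clusternode",
            {"name": node, "nodeid": str(nodeid), "votes": "1"},
        )
        fence = ET.SubElement(clusternode, "fence")
        method = ET.SubElement(fence, "method", {"name": "1"})
        # Each node points at the shared manual fence device by name.
        ET.SubElement(method, "device", {"name": device_name, "nodename": node})

    fencedevices = ET.Element("fencedevices")
    ET.SubElement(
        fencedevices,
        "fencedevice",
        {"agent": "fence_manual", "name": device_name},
    )
    return clusternodes, fencedevices


if __name__ == "__main__":
    # "node1" appears in the logs; "node2" is an assumed name.
    for element in manual_fence_fragment(["node1", "node2"]):
        if hasattr(ET, "indent"):  # pretty-print where available (Python 3.9+)
            ET.indent(element)
        print(ET.tostring(element, encoding="unicode"))
```

The printed elements would go inside the <cluster> element of cluster.conf, replacing any existing clusternodes and fencedevices sections. Manual fencing is only meant for test setups, since someone has to confirm by hand that the failed node is really down.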
Thanks,
Parvez
On Wed, Oct 3, 2012 at 10:46 AM, Renchu Mathew <renchumv@xxxxxxxxx> wrote:
Hi Parvez,

I am trying to set up a test cluster environment, but I haven't done fencing. Please find the error messages below. Some time after the nodes restart, the other node goes down. Can you please send me the configuration for manual fencing?

> Please find attached my cluster setup. It is not stable
> and /var/log/messages shows the below errors.
>
>
> Sep 11 08:49:10 node1 corosync[1814]: [QUORUM] Members[2]: 1 2
> Sep 11 08:49:10 node1 corosync[1814]: [QUORUM] Members[2]: 1 2
> Sep 11 08:49:10 node1 corosync[1814]: [CPG ] chosen downlist: sender r(0) ip(192.168.1.251) ; members(old:2 left:1)
> Sep 11 08:49:10 node1 corosync[1814]: [MAIN ] Completed service synchronization, ready to provide service.
> Sep 11 08:49:11 node1 corosync[1814]: cman killed by node 2 because we were killed by cman_tool or other application
> Sep 11 08:49:11 node1 fenced[1875]: telling cman to remove nodeid 2 from cluster
> Sep 11 08:49:11 node1 fenced[1875]: cluster is down, exiting
> Sep 11 08:49:11 node1 gfs_controld[1950]: cluster is down, exiting
> Sep 11 08:49:11 node1 gfs_controld[1950]: daemon cpg_dispatch error 2
> Sep 11 08:49:11 node1 gfs_controld[1950]: cpg_dispatch error 2
> Sep 11 08:49:11 node1 dlm_controld[1889]: cluster is down, exiting
> Sep 11 08:49:11 node1 dlm_controld[1889]: daemon cpg_dispatch error 2
> Sep 11 08:49:11 node1 dlm_controld[1889]: cpg_dispatch error 2
> Sep 11 08:49:11 node1 dlm_controld[1889]: cpg_dispatch error 2
> Sep 11 08:49:11 node1 dlm_controld[1889]: cpg_dispatch error 2
> Sep 11 08:49:11 node1 fenced[1875]: daemon cpg_dispatch error 2
> Sep 11 08:49:11 node1 rgmanager[2409]: #67: Shutting down uncleanly
> Sep 11 08:49:11 node1 rgmanager[17059]: [clusterfs] unmounting /Data
> Sep 11 08:49:11 node1 rgmanager[17068]: [clusterfs] Sending SIGTERM to processes on /Data
> Sep 11 08:49:16 node1 rgmanager[17104]: [clusterfs] unmounting /Data
> Sep 11 08:49:16 node1 rgmanager[17113]: [clusterfs] Sending SIGKILL to processes on /Data
> Sep 11 08:49:19 node1 kernel: dlm: closing connection to node 2
> Sep 11 08:49:19 node1 kernel: dlm: closing connection to node 1
> Sep 11 08:49:19 node1 kernel: dlm: gfs2: no userland control daemon, stopping lockspace
> Sep 11 08:49:22 node1 rgmanager[17149]: [clusterfs] unmounting /Data
> Sep 11 08:49:22 node1 rgmanager[17158]: [clusterfs] Sending SIGKILL to processes on /Data
>
>
>
> Also, when I try to restart the cman service, I get the error below.
> Starting cluster:
> Checking if cluster has been disabled at boot... [ OK ]
> Checking Network Manager... [ OK ]
> Global setup... [ OK ]
> Loading kernel modules... [ OK ]
> Mounting configfs... [ OK ]
> Starting cman... [ OK ]
> Waiting for quorum... [ OK ]
> Starting fenced... [ OK ]
> Starting dlm_controld... [ OK ]
> Starting gfs_controld... [ OK ]
> Unfencing self... fence_node: cannot connect to cman [FAILED]
> Stopping cluster:
> Leaving fence domain... [ OK ]
> Stopping gfs_controld... [ OK ]
> Stopping dlm_controld... [ OK ]
> Stopping fenced... [ OK ]
> Stopping cman... [ OK ]
> Unloading kernel modules... [ OK ]
> Unmounting configfs... [ OK ]
>
> Thanks again.
> Renchu Mathew
> On Tue, Sep 11, 2012 at 9:10 PM, Arun Eapen CISSP, RHCA <arun@xxxxxxxxxx> wrote:
>
>
>
> Put fenced in debug mode and copy the error messages for me to debug.
>
> On Tue, 2012-09-11 at 11:52 +0400, Renchu Mathew wrote:
> > Hi Arun,
> >
> > I did the RH436 course conducted by you at Red Hat Bangalore. How
> > are you?
> >
> > I have configured a 2-node failover cluster (almost the same as our
> > RH436 lab setup in Bangalore). It is mostly working except for
> > fencing. If I pull the network cable of the active node, the service
> > does not switch to the other node automatically; it hangs, and I
> > have to switch it manually. Is there a script for creating dummy
> > fencing in RHCS that will restart or shut down the other node?
> > Please find attached my cluster.conf file. Is there any way to power
> > fence using an APC UPS?
> >
> > Could you please help me if you get some time.
> >
> > Thanks and regards
> > Renchu Mathew
> >
> >
> >
>
>
>
> --
> Arun Eapen
> CISSP, RHC{A,DS,E,I,SS,VA,X}
> Senior Technical Consultant & Certification Poobah
> Red Hat India Pvt. Ltd.,
> No - 4/1, Bannergatta Road,
> IBC Knowledge Park,
> 11th floor, Tower D,
> Bangalore - 560029, INDIA.
>
>
>
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster