Re: Hi

A curious observation: there is a sudden surge of emails being sent to private addresses rather than over the mailing list.

Please send your doubts/questions to the mailing list "linux-cluster@xxxxxxxxxx" instead of addressing people personally.

Regarding the configuration for manual fencing: I don't have it with me; it was available with RHEL 5.5. Check in the system-config-cluster tool whether you can add manual fencing.
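For reference, manual fencing on RHEL 5 used the fence_manual agent in cluster.conf. This is a rough sketch from memory, not a tested config: the cluster name, node names, and config_version are illustrative and must be adapted to your setup:

```xml
<?xml version="1.0"?>
<cluster name="testcluster" config_version="2">
  <!-- two_node/expected_votes let a two-node cluster keep quorum with one member -->
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="node1" nodeid="1">
      <fence>
        <method name="1">
          <!-- references the fence_manual device declared below -->
          <device name="manual" nodename="node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2" nodeid="2">
      <fence>
        <method name="1">
          <device name="manual" nodename="node2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="manual" agent="fence_manual"/>
  </fencedevices>
</cluster>
```

With fence_manual, nothing actually powers the failed node off: after a failure you must verify by hand that the node is really down and then acknowledge the fence on a surviving node, e.g. `fence_ack_manual -n node2`; until then, GFS2 and rgmanager recovery stay blocked. Also note that fence_manual is unsupported and was dropped after RHEL 5, which is likely why it is not offered on a RHEL 6 cluster; for anything beyond a lab test, a real power fence agent such as fence_apc (for APC switched PDUs) is the supported route.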

Thanks,
Parvez

On Wed, Oct 3, 2012 at 10:46 AM, Renchu Mathew <renchumv@xxxxxxxxx> wrote:
Hi Parvez,
 
I am trying to set up a test cluster environment, but I haven't done fencing yet. Please find the error messages below. Some time after the nodes restart, the other node goes down. Can you please send me the configuration for manual fencing?
 
> Please find my cluster setup attached. It is not stable,
> and /var/log/messages shows the errors below.
>
>
> Sep 11 08:49:10 node1 corosync[1814]:   [QUORUM] Members[2]: 1 2
> Sep 11 08:49:10 node1 corosync[1814]:   [QUORUM] Members[2]: 1 2
> Sep 11 08:49:10 node1 corosync[1814]:   [CPG   ] chosen downlist:
> sender r(0) ip(192.168.1.251) ; members(old:2 left:1)
> Sep 11 08:49:10 node1 corosync[1814]:   [MAIN  ] Completed service
> synchronization, ready to provide service.
> Sep 11 08:49:11 node1 corosync[1814]: cman killed by node 2 because we
> were killed by cman_tool or other application
> Sep 11 08:49:11 node1 fenced[1875]: telling cman to remove nodeid 2
> from cluster
> Sep 11 08:49:11 node1 fenced[1875]: cluster is down, exiting
> Sep 11 08:49:11 node1 gfs_controld[1950]: cluster is down, exiting
> Sep 11 08:49:11 node1 gfs_controld[1950]: daemon cpg_dispatch error 2
> Sep 11 08:49:11 node1 gfs_controld[1950]: cpg_dispatch error 2
> Sep 11 08:49:11 node1 dlm_controld[1889]: cluster is down, exiting
> Sep 11 08:49:11 node1 dlm_controld[1889]: daemon cpg_dispatch error 2
> Sep 11 08:49:11 node1 dlm_controld[1889]: cpg_dispatch error 2
> Sep 11 08:49:11 node1 dlm_controld[1889]: cpg_dispatch error 2
> Sep 11 08:49:11 node1 dlm_controld[1889]: cpg_dispatch error 2
> Sep 11 08:49:11 node1 fenced[1875]: daemon cpg_dispatch error 2
> Sep 11 08:49:11 node1 rgmanager[2409]: #67: Shutting down uncleanly
> Sep 11 08:49:11 node1 rgmanager[17059]: [clusterfs] unmounting /Data
> Sep 11 08:49:11 node1 rgmanager[17068]: [clusterfs] Sending SIGTERM to
> processes on /Data
> Sep 11 08:49:16 node1 rgmanager[17104]: [clusterfs] unmounting /Data
> Sep 11 08:49:16 node1 rgmanager[17113]: [clusterfs] Sending SIGKILL to
> processes on /Data
> Sep 11 08:49:19 node1 kernel: dlm: closing connection to node 2
> Sep 11 08:49:19 node1 kernel: dlm: closing connection to node 1
> Sep 11 08:49:19 node1 kernel: dlm: gfs2: no userland control daemon,
> stopping lockspace
> Sep 11 08:49:22 node1 rgmanager[17149]: [clusterfs] unmounting /Data
> Sep 11 08:49:22 node1 rgmanager[17158]: [clusterfs] Sending SIGKILL to
> processes on /Data
>
>
>
> Also, when I try to restart the cman service, the error below appears.
> Starting cluster:
>    Checking if cluster has been disabled at boot...        [  OK  ]
>    Checking Network Manager...                             [  OK  ]
>    Global setup...                                         [  OK  ]
>    Loading kernel modules...                               [  OK  ]
>    Mounting configfs...                                    [  OK  ]
>    Starting cman...                                        [  OK  ]
>    Waiting for quorum...                                   [  OK  ]
>    Starting fenced...                                      [  OK  ]
>    Starting dlm_controld...                                [  OK  ]
>    Starting gfs_controld...                                [  OK  ]
>    Unfencing self... fence_node: cannot connect to cman
>                                                            [FAILED]
> Stopping cluster:
>    Leaving fence domain...                                 [  OK  ]
>    Stopping gfs_controld...                                [  OK  ]
>    Stopping dlm_controld...                                [  OK  ]
>    Stopping fenced...                                      [  OK  ]
>    Stopping cman...                                        [  OK  ]
>    Unloading kernel modules...                             [  OK  ]
>    Unmounting configfs...                                  [  OK  ]
>
> Thanks again.
> Renchu Mathew
> On Tue, Sep 11, 2012 at 9:10 PM, Arun Eapen CISSP, RHCA
> <arun@xxxxxxxxxx> wrote:
>
>
>
>         Put the fenced daemon in debug mode and send me the error
>         messages so I can debug.
>
>         On Tue, 2012-09-11 at 11:52 +0400, Renchu Mathew wrote:
>         > Hi Arun,
>         >
>         > I have done the RH436 course conducted by you at Red Hat
>         > Bangalore. How are you?
>         >
>         > I have configured a two-node failover cluster (almost the
>         > same as our RH436 lab setup in Bangalore). It is almost OK
>         > except for fencing. If I pull the active node's network
>         > cable, the services do not switch to the other node
>         > automatically; it hangs, and I have to fail over manually.
>         > Is there any script for creating a dummy fence device in
>         > RHCS that will restart or shut down the other node? Please
>         > find my cluster.conf file attached. Is there any way we
>         > can power fence using an APC UPS?
>         >
>         > Could you please help me if you get some time?
>         >
>         > Thanks and regards,
>         > Renchu Mathew
>         >
>         >
>         >
>
>
>
>         --
>         Arun Eapen
>         CISSP, RHC{A,DS,E,I,SS,VA,X}
>         Senior Technical Consultant & Certification Poobah
>         Red Hat India Pvt. Ltd.,
>         No - 4/1, Bannergatta Road,
>         IBC Knowledge Park,
>         11th floor, Tower D,
>         Bangalore - 560029, INDIA.
>
>
>


--
Arun Eapen
CISSP, RHC{A,DS,E,I,SS,VA,X}
Senior Technical Consultant & Certification Poobah
Red Hat India Pvt. Ltd.,
No - 4/1, Bannergatta Road,
IBC Knowledge Park,
11th floor, Tower D,
Bangalore - 560029, INDIA.



-- 
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
