I always use "tcpdump -ni bond1 port 5405" to check whether both nodes
are involved in the communication. If they are not, that points to a
multicast problem.

2014-06-12 16:43 GMT+02:00 Kaloyan Kovachev <kkovachev@xxxxxxxxx>:

> Do you have a different auth key on each node by any chance?
>
> On 2014-06-12 17:29, Arun G Nair wrote:
>
>> We have multicast enabled on the switch. I've also tried the
>> multicast.py tool from RH's knowledge base to test multicast and I
>> see the expected output, though the tool uses a different multicast
>> IP (I guess that shouldn't matter). I've tried increasing the
>> post_join_delay to 360 seconds to give me enough time to check
>> everything on both nodes. One node still gets fenced. `clustat`
>> output says the other node is offline on both servers. So one node
>> can't see the other one? This again points to an issue with
>> multicast. Any other clues as to what/where to look?
>>
>> On Wed, Jun 11, 2014 at 8:33 PM, Digimer <lists@xxxxxxxxxx> wrote:
>>
>> On 11/06/14 10:48 AM, Arun G Nair wrote:
>> Hello,
>>
>> What are the reasons for fence loops when only cman is started? We
>> have an RHEL 6.5 2-node cluster which goes into a fence loop every
>> time we start cman on both nodes. Either one fences the other.
>> Multicast seems to be working properly. My understanding is that
>> without rgmanager running there won't be a multicast group
>> subscription? I don't see the multicast address in 'netstat -g'
>> unless rgmanager is running. I've tried to increase the fence
>> post_join_delay but one of the nodes still gets fenced.
>>
>> The cluster works fine if we use unicast UDP.
>>
>> Thanks,
>>
>> Hi,
>>
>> When cman starts, it waits post_join_delay seconds for the peer to
>> connect. If the peer has not connected when that time expires
>> (6 seconds by default, iirc), it gives up and calls a fence against
>> the peer to put it into a known state.
>>
>> Corosync is what determines membership, and it is started by cman.
>> The rgmanager only handles resource start/stop/relocate/recovery and
>> has nothing to do with fencing directly. Corosync is what uses
>> multicast.
>>
>> So as you seem to have already surmised, multicast is probably not
>> working in your environment. Have you enabled multicast traffic on
>> the firewall? Do your switches support multicast properly?
>>
>> digimer
>>
>> --
>> Digimer
>> Papers and Projects: https://alteeve.ca/w/ [1]
>>
>> What if the cure for cancer is trapped in the mind of a person
>> without access to education?
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster@xxxxxxxxxx
>> https://www.redhat.com/mailman/listinfo/linux-cluster [2]
>
> --
> Arun G Nair
> Sr. Sysadmin
> Dimension Data | Ph: (800) 664-9973
> Feedback? We're listening [3]
>
> Links:
> ------
> [1] https://alteeve.ca/w/
> [2] https://www.redhat.com/mailman/listinfo/linux-cluster
> [3] http://www.surveymonkey.com/s/XRCYXBH

--
This is my life, and I live it for as long as God wills
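
For anyone who wants to automate the tcpdump check suggested at the top of this message, here is a minimal sketch in Python. It assumes the usual one-line-per-packet text output of "tcpdump -n" for IPv4 UDP traffic. The function names, the node addresses 10.0.0.1 and 10.0.0.2, and the multicast group 239.192.0.1 are all hypothetical and only there for illustration; the sample lines are made up, not a real capture from this cluster.

```python
import re

# Address part of a `tcpdump -n` IPv4 line, for example:
# "16:43:01.000001 IP 10.0.0.1.5404 > 239.192.0.1.5405: UDP, length 119"
PACKET_RE = re.compile(
    r"\bIP (\d+\.\d+\.\d+\.\d+)\.(\d+) > (\d+\.\d+\.\d+\.\d+)\.(\d+):"
)


def senders_on_port(lines, port=5405):
    """Return the set of source IPs seen in packets to or from `port`."""
    senders = set()
    for line in lines:
        match = PACKET_RE.search(line)
        if match is None:
            continue  # not an IPv4 packet line, skip it
        src_ip, src_port, _dst_ip, dst_port = match.groups()
        if port in (int(src_port), int(dst_port)):
            senders.add(src_ip)
    return senders


def missing_nodes(lines, expected_ips, port=5405):
    """Return the expected node IPs that never appear as a sender."""
    return set(expected_ips) - senders_on_port(lines, port)


# Hypothetical capture in which only one node is talking.
SAMPLE = [
    "16:43:01.000001 IP 10.0.0.1.5404 > 239.192.0.1.5405: UDP, length 119",
    "16:43:01.200002 IP 10.0.0.1.5404 > 239.192.0.1.5405: UDP, length 119",
]

print(sorted(missing_nodes(SAMPLE, ["10.0.0.1", "10.0.0.2"])))
```

If a node's address comes back as missing when the capture is taken on the other node, its packets are not arriving there, which is the symptom described above. An empty result only shows that both nodes were seen sending on that interface; it does not rule out other causes of the fence loop.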