Re: information request

Hello Steven,
Here are the testing results.
iptables is stopped on both ends.

[root@eusipgw01 ~]# iptables -L -nv -x
Chain INPUT (policy ACCEPT 474551 packets, 178664760 bytes)
    pkts      bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
    pkts      bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 467510 packets, 169303071 bytes)
    pkts      bytes target     prot opt in     out     source               destination         
[root@eusipgw01 ~]#


The first case is the udpu transport with rrp_mode: none.

totem {
        version: 2
        token: 160
        token_retransmits_before_loss_const: 3
        join: 250
        consensus: 300
        vsftype: none
        max_messages: 20
        threads: 0
        nodeid: 2
        rrp_mode: none
        interface {
                member {
                        memberaddr: 10.10.10.1
                }
                ringnumber: 0
                bindnetaddr: 10.10.10.0
                mcastport: 5405
        }
        transport: udpu
}
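One thing I am not sure about: corosync.conf(5) seems to say that udpu needs a member entry for every node in the cluster, including the local one, and my interface block lists only 10.10.10.1. A sketch of the block with both addresses from this setup would be:

```
        interface {
                member {
                        memberaddr: 10.10.10.1
                }
                member {
                        memberaddr: 10.10.10.2
                }
                ringnumber: 0
                bindnetaddr: 10.10.10.0
                mcastport: 5405
        }
```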

Error

Nov 24 14:25:29 corosync [MAIN  ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.

pbx01*CLI> corosync show members

=============================================================
=== Cluster members =========================================
=============================================================
===
===
=============================================================


And it is the same with rrp_mode: passive. I suspect the unicast problem is some incompatibility with VMware? Only multicast gets through, but even then the cluster does not form completely.
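To check whether plain unicast UDP gets through on the totem port at all, independent of corosync, a small probe can be used (a sketch; on the real nodes, run the receiver bound to its 10.10.10.x address and point the sender at it from the other node - below it only self-tests over loopback):

```python
import socket
import threading
import time

PORT = 5405  # totem mcastport from corosync.conf

def receiver(bind_addr, results):
    # Wait up to 5s for a single UDP datagram on the totem port.
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind((bind_addr, PORT))
    s.settimeout(5.0)
    try:
        data, peer = s.recvfrom(1024)
        results.append((data, peer))
    except socket.timeout:
        pass
    finally:
        s.close()

def probe(dest_addr):
    # Fire one datagram at the receiver's address.
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.sendto(b"totem-port-probe", (dest_addr, PORT))
    s.close()

# Loopback self-test; on the real nodes, bind the receiver to the
# cluster address (e.g. 10.10.10.2) and call probe("10.10.10.2")
# from the peer.
results = []
t = threading.Thread(target=receiver, args=("127.0.0.1", results))
t.start()
time.sleep(0.2)  # let the receiver bind before sending
probe("127.0.0.1")
t.join()
print("received" if results else "nothing arrived (blocked?)")
```

If the probe gets through between the nodes but corosync still cannot form a cluster, the problem is above plain UDP delivery.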

Slava.


From: "Steven Dake" <sdake@xxxxxxxxxx>
To: "Slava Bendersky" <volga629@xxxxxxxxxxxxx>, "Digimer" <lists@xxxxxxxxxx>
Cc: discuss@xxxxxxxxxxxx
Sent: Sunday, November 24, 2013 12:01:09 PM
Subject: Re: information request


On 11/23/2013 11:20 PM, Slava Bendersky wrote:
Hello Digimer,
Here is what I see from the asterisk box:
pbx01*CLI> corosync show members

=============================================================
=== Cluster members =========================================
=============================================================
===
=== Node 1
=== --> Group: asterisk
=== --> Address 1: 10.10.10.1
=== Node 2
=== --> Group: asterisk
=== --> Address 1: 10.10.10.2
===
=============================================================

[2013-11-24 01:12:43] WARNING[2057]: res_corosync.c:316 ast_event_cb: CPG mcast failed (6)
[2013-11-24 01:12:43] WARNING[2057]: res_corosync.c:316 ast_event_cb: CPG mcast failed (6)


These errors come from asterisk via the cpg libraries because corosync cannot get a proper configuration.  The first message on this thread contains the scenarios under which those occur.  In a past log you had the error indicating a network fault.  This network fault error IIRC indicates the firewall is enabled.  The error from asterisk is expected if your firewall is enabled.  This was suggested before by Digimer, but can you confirm you totally disabled your firewall on the box (rather than just configured it as you thought was correct)?

Turn off the firewall - which will help us eliminate that as a source of the problem.

Next, use UDPU mode without RRP - confirm whether that works

Next use UDPU _passive_ rrp mode - confirm whether that works

One thing at a time in each step please.
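For step one, on RHEL 6 stopping the firewall outright would look something like this (run on both nodes; this assumes the stock iptables service, not a third-party firewall manager):

```shell
# Stop iptables now and keep it off across reboots (RHEL 6)
service iptables stop
chkconfig iptables off
# Confirm no rules remain loaded
iptables -L -n
```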

Regards
-steve


Is it possible that the message is related to the permissions of the user running corosync or asterisk?

And another point: when I send a ping, I see the MAC address of eth0 (the default-gateway interface) and not the cluster interface.

Corosync does not use the gateway address in any of its routing calculations.  Instead it physically binds to the interface specified as detailed in corosync.conf.5.  By physically binding, it avoids the gateway entirely.
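One way to see this from the outside (assuming net-tools is installed) is to check which local address corosync's UDP sockets are actually bound to:

```shell
# The local address should be on the cluster interface (10.10.10.x),
# not on the eth0/default-gateway side
netstat -anup | grep corosync
```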

Regards
-steve

pbx01*CLI> corosync ping
[2013-11-24 01:16:54] NOTICE[2057]: res_corosync.c:303 ast_event_cb: (ast_event_cb) Got event PING from server with EID: 'MAC address of the eth0'
[2013-11-24 01:16:54] WARNING[2057]: res_corosync.c:316 ast_event_cb: CPG mcast failed (6)


Slava.



From: "Slava Bendersky" <volga629@xxxxxxxxxxxxx>
To: "Digimer" <lists@xxxxxxxxxx>
Cc: discuss@xxxxxxxxxxxx
Sent: Sunday, November 24, 2013 12:26:40 AM
Subject: Re: information request

Hello Digimer,
I am trying to find information about VMware multicast problems. In tcpdump I see multicast traffic from the remote end, but I can't confirm whether the packets arrive as they should.
Can you please confirm that memberaddr: is the IP address of the second node?

06:05:02.408204 IP (tos 0x0, ttl 1, id 0, offset 0, flags [DF], proto UDP (17), length 221)
    10.10.10.1.5404 > 226.94.1.1.5405: [udp sum ok] UDP, length 193
06:05:02.894935 IP (tos 0x0, ttl 1, id 0, offset 0, flags [DF], proto UDP (17), length 221)
    10.10.10.2.5404 > 226.94.1.1.5405: [bad udp cksum 1a8c!] UDP, length 193
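The "bad udp cksum" may simply be TX checksum offload on whichever host generated the packet (tcpdump captures before the NIC fills in the checksum), rather than real corruption. One way to rule that out - an experiment to try, not a confirmed cause - is:

```shell
# Temporarily disable TX checksum offload on the cluster interface
ethtool -K eth1 tx off
# Then re-run the capture with verbose checksum verification
tcpdump -i eth1 -vv udp port 5405
```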


Slava.




From: "Digimer" <lists@xxxxxxxxxx>
To: "Slava Bendersky" <volga629@xxxxxxxxxxxxx>
Cc: discuss@xxxxxxxxxxxx
Sent: Saturday, November 23, 2013 11:54:55 PM
Subject: Re: information request

If I recall correctly, VMWare doesn't do multicast properly. I'm not
sure though, I don't use it.

Try unicast with no RRP. See if that works.

On 23/11/13 23:16, Slava Bendersky wrote:
> Hello Digimer,
> All machines are RHEL 6.4 on VMware; there is no physical switch, only
> the VMware virtual one. I set rrp to none and the cluster formed.
> With this config I am getting constant error messages.
>
> [root@eusipgw01 ~]# cat /etc/redhat-release
> Red Hat Enterprise Linux Server release 6.4 (Santiago)
>
> [root@eusipgw01 ~]# rpm -qa | grep corosync
> corosync-1.4.1-15.el6.x86_64
> corosynclib-1.4.1-15.el6.x86_64
>
>
> [2013-11-23 22:46:20] WARNING[2057] res_corosync.c: CPG mcast failed (6)
> [2013-11-23 22:46:20] WARNING[2057] res_corosync.c: CPG mcast failed (6)
>
> iptables
>
> -A INPUT -i eth1 -p udp -m state --state NEW -m udp --dport 5404:5407 -j
> NFLOG --nflog-prefix  "dmz_ext2fw: " --nflog-group 2
> -A INPUT -i eth1 -m pkttype --pkt-type multicast -j NFLOG
> --nflog-prefix  "dmz_ext2fw: " --nflog-group 2
> -A INPUT -i eth1 -m pkttype --pkt-type unicast -j NFLOG --nflog-prefix
> "dmz_ext2fw: " --nflog-group 2
> -A INPUT -i eth1 -p igmp -j NFLOG --nflog-prefix  "dmz_ext2fw: "
> --nflog-group 2
> -A INPUT -j ACCEPT
>
>
> ------------------------------------------------------------------------
> *From: *"Digimer" <lists@xxxxxxxxxx>
> *To: *"Slava Bendersky" <volga629@xxxxxxxxxxxxx>
> *Cc: *discuss@xxxxxxxxxxxx
> *Sent: *Saturday, November 23, 2013 10:34:00 PM
> *Subject: *Re: information request
>
> I don't think you ever said what OS you have. I've never had to set
> anything in sysctl.conf on RHEL/CentOS 6. Did you try disabling RRP
> entirely? If you have a managed switch, make sure persistent multicast
> groups are enabled or try a different switch entirely.
>
> *Something* is interrupting your network traffic. What does
> iptables-save show? Are these physical or virtual machines?
>
> The more information about your environment that you can share, the
> better we can help.
>
> On 23/11/13 22:29, Slava Bendersky wrote:
>> Hello Digimer,
>> As an idea, might it be some settings in sysctl.conf?
>>
>> Slava.
>>
>>
>> ------------------------------------------------------------------------
>> *From: *"Slava Bendersky" <volga629@xxxxxxxxxxxxx>
>> *To: *"Digimer" <lists@xxxxxxxxxx>
>> *Cc: *discuss@xxxxxxxxxxxx
>> *Sent: *Saturday, November 23, 2013 10:27:22 PM
>> *Subject: *Re: information request
>>
>> Hello Digimer,
>> Yes, I set it to passive, and SELinux is disabled.
>>
>> [root@eusipgw01 ~]# sestatus
>> SELinux status:                 disabled
>> [root@eusipgw01 ~]# cat /etc/corosync/corosync.conf
>> totem {
>>         version: 2
>>         token: 160
>>         token_retransmits_before_loss_const: 3
>>         join: 250
>>         consensus: 300
>>         vsftype: none
>>         max_messages: 20
>>         threads: 0
>>         nodeid: 2
>>         rrp_mode: passive
>>         interface {
>>                 ringnumber: 0
>>                 bindnetaddr: 10.10.10.0
>>                 mcastaddr: 226.94.1.1
>>                 mcastport: 5405
>>         }
>> }
>>
>> logging {
>>         fileline: off
>>         to_stderr: yes
>>         to_logfile: yes
>>         to_syslog: off
>>         logfile: /var/log/cluster/corosync.log
>>         debug: off
>>         timestamp: on
>>         logger_subsys {
>>                 subsys: AMF
>>                 debug: off
>>         }
>> }
>>
>>
>> Slava.
>>
>> ------------------------------------------------------------------------
>> *From: *"Digimer" <lists@xxxxxxxxxx>
>> *To: *"Slava Bendersky" <volga629@xxxxxxxxxxxxx>
>> *Cc: *"Steven Dake" <sdake@xxxxxxxxxx>, discuss@xxxxxxxxxxxx
>> *Sent: *Saturday, November 23, 2013 7:04:43 PM
>> *Subject: *Re: information request
>>
>> First up, I'm not Steven. Secondly, did you follow Steven's
>> recommendation to not use active RRP? Does the cluster form with no RRP
>> at all? Is selinux enabled?
>>
>> On 23/11/13 18:29, Slava Bendersky wrote:
>>> Hello Steven,
>>> In multicast the log is filling with this message:
>>>
>>> Nov 24 00:26:28 corosync [TOTEM ] A processor failed, forming new
>>> configuration.
>>> Nov 24 00:26:28 corosync [TOTEM ] A processor joined or left the
>>> membership and a new membership was formed.
>>> Nov 24 00:26:31 corosync [CPG   ] chosen downlist: sender r(0)
>>> ip(10.10.10.1) ; members(old:2 left:0)
>>> Nov 24 00:26:31 corosync [MAIN  ] Completed service synchronization,
>>> ready to provide service.
>>>
>>> In udpu it is not working at all.
>>>
>>> Slava.
>>>
>>>
>>> ------------------------------------------------------------------------
>>> *From: *"Digimer" <lists@xxxxxxxxxx>
>>> *To: *"Slava Bendersky" <volga629@xxxxxxxxxxxxx>
>>> *Cc: *"Steven Dake" <sdake@xxxxxxxxxx>, discuss@xxxxxxxxxxxx
>>> *Sent: *Saturday, November 23, 2013 6:05:56 PM
>>> *Subject: *Re: information request
>>>
>>> So multicast works with the firewall disabled?
>>>
>>> On 23/11/13 17:28, Slava Bendersky wrote:
>>>> Hello Steven,
>>>> I disabled iptables and there is no difference; the error message is the
>>>> same, but at least in multicast it wasn't generating the error.
>>>>
>>>>
>>>> Slava.
>>>>
>>>> ------------------------------------------------------------------------
>>>> *From: *"Digimer" <lists@xxxxxxxxxx>
>>>> *To: *"Slava Bendersky" <volga629@xxxxxxxxxxxxx>, "Steven Dake"
>>>> <sdake@xxxxxxxxxx>
>>>> *Cc: *discuss@xxxxxxxxxxxx
>>>> *Sent: *Saturday, November 23, 2013 4:37:36 PM
>>>> *Subject: *Re: [corosync] information request
>>>>
>>>> Does either mcast or unicast work if you disable the firewall? If so,
>>>> then at least you know for sure that iptables is the problem.
>>>>
>>>> The link here shows the iptables rules I use (for corosync in mcast and
>>>> other apps):
>>>>
>>>> https://alteeve.ca/w/AN!Cluster_Tutorial_2#Configuring_iptables
>>>>
>>>> digimer
>>>>
>>>> On 23/11/13 16:12, Slava Bendersky wrote:
>>>>> Hello Steven,
>>>>> Then this is what I see when set up through UDPU:
>>>>>
>>>>> Nov 23 22:08:13 corosync [MAIN  ] Compatibility mode set to whitetank.
>>>>> Using V1 and V2 of the synchronization engine.
>>>>> Nov 23 22:08:13 corosync [TOTEM ] adding new UDPU member {10.10.10.1}
>>>>> Nov 23 22:08:16 corosync [MAIN  ] Totem is unable to form a cluster
>>>>> because of an operating system or network fault. The most common cause
>>>>> of this message is that the local firewall is configured improperly.
>>>>>
>>>>>
>>>>> Might I be missing some firewall rules? I allowed unicast.
>>>>>
>>>>> Slava.
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------
>>>>> *From: *"Steven Dake" <sdake@xxxxxxxxxx>
>>>>> *To: *"Slava Bendersky" <volga629@xxxxxxxxxxxxx>
>>>>> *Cc: *discuss@xxxxxxxxxxxx
>>>>> *Sent: *Saturday, November 23, 2013 10:33:31 AM
>>>>> *Subject: *Re: information request
>>>>>
>>>>>
>>>>> On 11/23/2013 08:23 AM, Slava Bendersky wrote:
>>>>>
>>>>>     Hello Steven,
>>>>>
>>>>>     My setup
>>>>>
>>>>>     10.10.10.1 primary server ----- EoIP tunnel (VPN/IPsec) ----- DR
>>>>>     server 10.10.10.2
>>>>>
>>>>>     Both servers have 2 interfaces: eth0, which has the default gateway
>>>>>     out, and eth1, where corosync lives.
>>>>>
>>>>>     Iptables:
>>>>>
>>>>>     -A INPUT -i eth1 -p udp -m state --state NEW -m udp --dport 5404:5407
>>>>>     -A INPUT -i eth1 -m pkttype --pkt-type multicast
>>>>>     -A INPUT -i eth1 -p igmp
>>>>>
>>>>>
>>>>>     Corosync.conf
>>>>>
>>>>>     totem {
>>>>>             version: 2
>>>>>             token: 160
>>>>>             token_retransmits_before_loss_const: 3
>>>>>             join: 250
>>>>>             consensus: 300
>>>>>             vsftype: none
>>>>>             max_messages: 20
>>>>>             threads: 0
>>>>>             nodeid: 2
>>>>>             rrp_mode: active
>>>>>             interface {
>>>>>                     ringnumber: 0
>>>>>                     bindnetaddr: 10.10.10.0
>>>>>                     mcastaddr: 226.94.1.1
>>>>>                     mcastport: 5405
>>>>>             }
>>>>>     }
>>>>>
>>>>>     Join message
>>>>>
>>>>>     [root@eusipgw01 ~]# corosync-objctl | grep member
>>>>>     runtime.totem.pg.mrp.srp.members.2.ip=r(0) ip(10.10.10.2)
>>>>>     runtime.totem.pg.mrp.srp.members.2.join_count=1
>>>>>     runtime.totem.pg.mrp.srp.members.2.status=joined
>>>>>     runtime.totem.pg.mrp.srp.members.1.ip=r(0) ip(10.10.10.1)
>>>>>     runtime.totem.pg.mrp.srp.members.1.join_count=254
>>>>>     runtime.totem.pg.mrp.srp.members.1.status=joined
>>>>>
>>>>>     Is it possible that the ping is sent out of the wrong interface?
>>>>>
>>>>> Slava,
>>>>>
>>>>> I wouldn't expect so.
>>>>>
>>>>> Which version?
>>>>>
>>>>> Have you tried udpu instead?  If not, it is preferable to multicast
>>>>> unless you want absolute performance on cpg groups.  In most cases the
>>>>> performance difference is very small and not worth the trouble of
>>>>> setting up multicast in your network.
>>>>>
>>>>> Fabio had indicated rrp active mode is broken.  I don't know the
>>>>> details, but try passive RRP - it is actually better than active IMNSHO :)
>>>>>
>>>>> Regards
>>>>> -steve
>>>>>
>>>>>     Slava.
>>>>>
>>>>> ------------------------------------------------------------------------
>>>>>     *From: *"Steven Dake" <sdake@xxxxxxxxxx>
>>>>>     *To: *"Slava Bendersky" <volga629@xxxxxxxxxxxxx>, discuss@xxxxxxxxxxxx
>>>>>     *Sent: *Saturday, November 23, 2013 6:01:11 AM
>>>>>     *Subject: *Re: information request
>>>>>
>>>>>
>>>>>     On 11/23/2013 12:29 AM, Slava Bendersky wrote:
>>>>>
>>>>>         Hello Everyone,
>>>>>         Corosync runs on a box with 2 Ethernet interfaces.
>>>>>         I am getting this message
>>>>>         CPG mcast failed (6)
>>>>>
>>>>>         Any information is appreciated; thank you in advance.
>>>>>
>>>>>
>>>>>
>>>>>     https://github.com/corosync/corosync/blob/master/include/corosync/corotypes.h#L84
>>>>>
>>>>>     This can occur because:
>>>>>     a) firewall is enabled - there should be something in the logs
>>>>>     telling you to properly configure the firewall
>>>>>     b) a config change is in progress - this is a normal response, and
>>>>>     you should try the request again
>>>>>     c) a bug in the synchronization code is resulting in a blocked
>>>>>     unsynced cluster
>>>>>
>>>>>     c is very unlikely at this point.
>>>>>
>>>>>     2 ethernet interfaces = rrp mode, bonding, or something else?
>>>>>
>>>>>     Digimer needs moar infos :)
>>>>>
>>>>>     Regards
>>>>>     -steve
>>>>>
>>>>>
>>>>>
>>>>>         _______________________________________________
>>>>>         discuss mailing list
>>>>>         discuss@xxxxxxxxxxxx
>>>>>         http://lists.corosync.org/mailman/listinfo/discuss
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Digimer
>>>> Papers and Projects: https://alteeve.ca/w/
>>>> What if the cure for cancer is trapped in the mind of a person without
>>>> access to education?
>>>>
>>>
>>>
>>> --
>>> Digimer
>>> Papers and Projects: https://alteeve.ca/w/
>>> What if the cure for cancer is trapped in the mind of a person without
>>> access to education?
>>>
>>
>>
>> --
>> Digimer
>> Papers and Projects: https://alteeve.ca/w/
>> What if the cure for cancer is trapped in the mind of a person without
>> access to education?
>>
>>
>
>
> --
> Digimer
> Papers and Projects: https://alteeve.ca/w/
> What if the cure for cancer is trapped in the mind of a person without
> access to education?
>


--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?

