Re: Corosync instances seem to ignore each other despite many UDP chat without firewall

David Guyot <david.guyot@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx> · Wed, 06 Jun 2012 17:28:34 +0200

Hello again, everybody.

I just noticed that, when I tried to set secauth to off, during the
period in which one node accepted secured connections and the other
unsecured connections, the network fault messages were replaced by
these:
Jun 06 17:16:17 corosync [TOTEM ] Received message has invalid digest...
ignoring.
Jun 06 17:16:17 corosync [TOTEM ] Invalid packet data
Jun 06 17:16:17 corosync [TOTEM ] Received message has invalid digest...
ignoring.
Jun 06 17:16:17 corosync [TOTEM ] Invalid packet data
Jun 06 17:16:17 corosync [TOTEM ] Received message has invalid digest...
ignoring.
Jun 06 17:16:17 corosync [TOTEM ] Invalid packet data
Jun 06 17:16:17 corosync [TOTEM ] Received message has invalid digest...
ignoring.
Jun 06 17:16:17 corosync [TOTEM ] Invalid packet data
Jun 06 17:16:17 corosync [TOTEM ] Received message has invalid digest...
ignoring.
Jun 06 17:16:17 corosync [TOTEM ] Invalid packet data
Jun 06 17:16:17 corosync [TOTEM ] Received message has invalid digest...
ignoring.
Jun 06 17:16:17 corosync [TOTEM ] Invalid packet data
Jun 06 17:16:17 corosync [TOTEM ] Received message has invalid digest...
ignoring.
Jun 06 17:16:17 corosync [TOTEM ] Invalid packet data
Jun 06 17:16:18 corosync [TOTEM ] Received message has invalid digest...
ignoring.
Jun 06 17:16:18 corosync [TOTEM ] Invalid packet data
Jun 06 17:16:18 corosync [TOTEM ] Received message has invalid digest...
ignoring.
Jun 06 17:16:18 corosync [TOTEM ] Invalid packet data
Jun 06 17:16:18 corosync [TOTEM ] Received message has invalid digest...
ignoring.
Jun 06 17:16:18 corosync [TOTEM ] Invalid packet data
Jun 06 17:16:18 corosync [TOTEM ] Received message has invalid digest...
ignoring.
Jun 06 17:16:18 corosync [TOTEM ] Invalid packet data
Jun 06 17:16:18 corosync [TOTEM ] Received message has invalid digest...
ignoring.
Jun 06 17:16:18 corosync [TOTEM ] Invalid packet data
Jun 06 17:16:18 corosync [TOTEM ] Received message has invalid digest...
ignoring.
Jun 06 17:16:18 corosync [TOTEM ] Invalid packet data
Jun 06 17:16:18 corosync [TOTEM ] Received message has invalid digest...
ignoring.
Jun 06 17:16:18 corosync [TOTEM ] Invalid packet data
Jun 06 17:16:18 corosync [TOTEM ] Received message has invalid digest...
ignoring.
Jun 06 17:16:18 corosync [TOTEM ] Invalid packet data
Jun 06 17:16:18 corosync [TOTEM ] Received message has invalid digest...
ignoring.
Jun 06 17:16:18 corosync [TOTEM ] Invalid packet data
Jun 06 17:16:18 corosync [TOTEM ] Received message has invalid digest...
ignoring.
Jun 06 17:16:18 corosync [TOTEM ] Invalid packet data
Jun 06 17:16:18 corosync [TOTEM ] Received message has invalid digest...
ignoring.
Jun 06 17:16:18 corosync [TOTEM ] Invalid packet data
Jun 06 17:16:18 corosync [TOTEM ] Received message has invalid digest...
ignoring.
Jun 06 17:16:18 corosync [TOTEM ] Invalid packet data
Jun 06 17:16:18 corosync [TOTEM ] Received message has invalid digest...
ignoring.
Jun 06 17:16:18 corosync [TOTEM ] Invalid packet data

If this is relevant...
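
For illustration, here is a minimal Python sketch of why a node that still has secauth on would reject packets from a node that has it off. It is a toy model only: the packet layout (a 20-byte HMAC-SHA1 digest followed by the payload), the key and the payload are assumptions made for the example, not Corosync's actual wire format, and the SOBER128 encryption step named in the startup log is left out.

```python
import hashlib
import hmac

DIGEST_LEN = hashlib.sha1().digest_size  # 20 bytes for SHA-1


def sign(key, payload):
    """Toy 'secauth: on' sender: prefix the payload with its HMAC-SHA1 digest."""
    return hmac.new(key, payload, hashlib.sha1).digest() + payload


def verify(key, packet):
    """Toy 'secauth: on' receiver: recompute the digest and compare it."""
    digest, payload = packet[:DIGEST_LEN], packet[DIGEST_LEN:]
    expected = hmac.new(key, payload, hashlib.sha1).digest()
    return hmac.compare_digest(digest, expected)


# Hypothetical key and payload, chosen only for this example.
key = b"hypothetical shared authkey"
payload = b"totem join message (placeholder)"

print(verify(key, sign(key, payload)))  # both sides sign: True
print(verify(key, payload))             # sender did not sign: False
```

In this model the unsigned packet fails verification because its first 20 bytes are ordinary payload rather than a digest, which would match the "invalid digest" messages appearing only while the two nodes disagreed on secauth.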

Thank you in advance.

Regards.

On 06/06/2012 17:05, David Guyot wrote:
> Hello, everybody.
>
> I'm trying to set up a 2-node Debian Squeeze x64 cluster with
> Corosync and Pacemaker, but I'm stuck on a strange issue: despite a
> lot of UDP traffic between the nodes (so the network itself seems OK),
> the Corosync instances seem to ignore each other: the other node is never
> detected, and crm_mon --one-shot -V only says "Connection to cluster
> failed: connection failed". The strangest part is that both
> Corosync nodes are filling their logs with error messages saying "Totem
> is unable to form a cluster because of an operating system or network
> fault. The most common cause of this message is that the local firewall
> is configured improperly.". I ran tcpdump on all traffic between the
> hosts, and I see 2-way traffic between them. I tried the backports versions
> of all Corosync- and Pacemaker-related packages, without improvement.
>
> I must add that, due to my hosting company's network policy, I was forced
> to use UDP unicast instead of multicast, because multicast is blocked.
>
> Here is my config:
> corosync.conf:
> # Please read the corosync.conf.5 manual page
> compatibility: whitetank
>
> totem {
>         version: 2
>         secauth: on
>         interface {
>                 member {
>                         memberaddr: 176.31.238.131
>                 }
>                 ringnumber: 0
>                 bindnetaddr: 37.59.18.208
>                 mcastport: 5405
>                 ttl: 1
>         }
>         transport: udpu
> }
>
> logging {
>         fileline: off
>         to_logfile: yes
>         to_syslog: yes
>         debug: on
>         logfile: /var/log/corosync.log
>         debug: off
>         timestamp: on
>         logger_subsys {
>                 subsys: AMF
>                 debug: off
>         }
> }
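
One thing that may be worth checking in the config above: as far as I read corosync.conf(5), with the udpu transport every node that should be part of the membership gets its own member directive, and the config above lists only one. The sketch below is a hypothetical helper (not part of Corosync) that extracts the memberaddr entries from a config text and reports which expected node addresses are not listed. It assumes that 37.59.18.208 (the interface reported as up in the log) and 176.31.238.131 are the two node addresses. Whether the missing entry actually explains the failure is not established here.

```python
import re


def member_addrs(conf_text):
    """Return every memberaddr value found in a corosync.conf text."""
    return re.findall(r"^\s*memberaddr:\s*(\S+)", conf_text, flags=re.MULTILINE)


def missing_members(conf_text, node_addrs):
    """Return the expected node addresses that have no memberaddr entry."""
    listed = set(member_addrs(conf_text))
    return [addr for addr in node_addrs if addr not in listed]


# Trimmed copy of the totem section quoted above.
conf = """
totem {
        version: 2
        secauth: on
        interface {
                member {
                        memberaddr: 176.31.238.131
                }
                ringnumber: 0
                bindnetaddr: 37.59.18.208
                mcastport: 5405
                ttl: 1
        }
        transport: udpu
}
"""

# Assumed cluster nodes: the local interface address and the listed peer.
nodes = ["37.59.18.208", "176.31.238.131"]

print(missing_members(conf, nodes))  # ['37.59.18.208']
```
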
>
> Log messages:
> Jun 06 16:35:14 corosync [MAIN  ] Corosync Cluster Engine ('1.4.2'):
> started and ready to provide service.
> Jun 06 16:35:14 corosync [MAIN  ] Corosync built-in features: nss
> Jun 06 16:35:14 corosync [MAIN  ] Successfully read main configuration
> file '/etc/corosync/corosync.conf'.
> Jun 06 16:35:14 corosync [TOTEM ] Initializing transport (UDP/IP Unicast).
> Jun 06 16:35:14 corosync [TOTEM ] Initializing transmit/receive
> security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
> Jun 06 16:35:14 corosync [TOTEM ] The network interface [37.59.18.208]
> is now up.
> Jun 06 16:35:14 corosync [SERV  ] Service engine loaded: corosync
> extended virtual synchrony service
> Jun 06 16:35:14 corosync [SERV  ] Service engine loaded: corosync
> configuration service
> Jun 06 16:35:14 corosync [SERV  ] Service engine loaded: corosync
> cluster closed process group service v1.01
> Jun 06 16:35:14 corosync [SERV  ] Service engine loaded: corosync
> cluster config database access v1.01
> Jun 06 16:35:14 corosync [SERV  ] Service engine loaded: corosync
> profile loading service
> Jun 06 16:35:14 corosync [SERV  ] Service engine loaded: corosync
> cluster quorum service v0.1
> Jun 06 16:35:14 corosync [MAIN  ] Compatibility mode set to whitetank. 
> Using V1 and V2 of the synchronization engine.
> Jun 06 16:35:23 corosync [TOTEM ] Totem is unable to form a cluster
> because of an operating system or network fault. The most common cause
> of this message is that the local firewall is configured improperly.
> Jun 06 16:35:25 corosync [TOTEM ] Totem is unable to form a cluster
> because of an operating system or network fault. The most common cause
> of this message is that the local firewall is configured improperly.
> Jun 06 16:35:27 corosync [TOTEM ] Totem is unable to form a cluster
> because of an operating system or network fault. The most common cause
> of this message is that the local firewall is configured improperly.
> Jun 06 16:35:30 corosync [TOTEM ] Totem is unable to form a cluster
> because of an operating system or network fault. The most common cause
> of this message is that the local firewall is configured improperly.
>
> # uname -a
> Linux Vindemiatrix 3.2.13-grsec-xxxx-grs-ipv6-64 #1 SMP Thu Mar 29
> 09:48:59 UTC 2012 x86_64 GNU/Linux
>
> # iptables -nvL
> Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
>  pkts bytes target     prot opt in     out     source              
> destination        
>     0     0 ACCEPT     all  --  tun0   *       0.0.0.0/0           
> 0.0.0.0/0          
>     0     0 ACCEPT     all  --  lo     *       0.0.0.0/0           
> 0.0.0.0/0          
>     0     0            tcp  --  *      *       0.0.0.0/0           
> 0.0.0.0/0           tcp dpt:22 state NEW recent: SET name: SSH side: source
>     0     0 LOGDROP    tcp  --  *      *       0.0.0.0/0           
> 0.0.0.0/0           tcp dpt:22 state NEW recent: UPDATE seconds: 60
> hit_count: 6 TTL-Match name: SSH side: source
>     0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0           
> 0.0.0.0/0           tcp dpt:22 state NEW
>     0     0 LOGDROP    tcp  --  *      *       0.0.0.0/0           
> 0.0.0.0/0           tcp flags:0x17/0x02 multiport dports 80,443 #conn/32 > 100
>     1    48 ACCEPT     tcp  --  *      *       0.0.0.0/0           
> 0.0.0.0/0           tcp flags:0x17/0x02 multiport dports 80,443
>     0     0 ACCEPT     tcp  --  eth0   *       0.0.0.0/0           
> 0.0.0.0/0           tcp dpt:21 flags:0x17/0x02 limit: avg 5/min burst 50
> recent: SET name: FTP side: source
>     0     0 LOGDROP    tcp  --  eth0   *       0.0.0.0/0           
> 0.0.0.0/0           tcp dpt:21 flags:0x17/0x02 recent: UPDATE seconds:
> 60 hit_count: 6 TTL-Match name: FTP side: source
>     0     0 ACCEPT     tcp  --  eth0   *       0.0.0.0/0           
> 0.0.0.0/0           tcp dpt:21 flags:0x17/0x02
>     0     0 ACCEPT     tcp  --  eth0   *       0.0.0.0/0           
> 0.0.0.0/0           tcp dpts:50000:50500 state RELATED,ESTABLISHED
>     0     0 ACCEPT     tcp  --  eth0   *       176.31.238.131      
> 0.0.0.0/0           tcp dpt:1194
> 11867 3145K ACCEPT     udp  --  *      *       0.0.0.0/0           
> 0.0.0.0/0           udp dpt:5405 /* Corosync */
>    35  9516 ACCEPT     all  --  eth0   *       0.0.0.0/0           
> 0.0.0.0/0           state NEW limit: avg 30/sec burst 200
>     0     0 LOGDROP    tcp  --  eth0   *       0.0.0.0/0           
> 0.0.0.0/0           tcp dpt:80 STRING match "w00tw00t.at.ISC.SANS." ALGO
> name bm TO 65535
>     0     0 ACCEPT     icmp --  *      *       0.0.0.0/0           
> 0.0.0.0/0           limit: avg 10/sec burst 5
>     0     0 LOGDROP    icmp --  *      *       0.0.0.0/0           
> 0.0.0.0/0          
>  1031 70356 ACCEPT     all  --  *      *       0.0.0.0/0           
> 0.0.0.0/0           state RELATED,ESTABLISHED
>     3   132 LOGDROP    all  --  *      *       0.0.0.0/0           
> 0.0.0.0/0          
>
> Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
>  pkts bytes target     prot opt in     out     source              
> destination        
>     0     0 LOGDROP    all  --  *      *       0.0.0.0/0           
> 0.0.0.0/0          
>
> Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
>  pkts bytes target     prot opt in     out     source              
> destination        
>     0     0 ACCEPT     all  --  *      tun0    0.0.0.0/0           
> 0.0.0.0/0          
>     0     0 ACCEPT     all  --  *      lo      0.0.0.0/0           
> 0.0.0.0/0          
>     0     0 LOGDROP    tcp  --  *      eth0    0.0.0.0/0           
> 0.0.0.0/0           tcp dpt:80 owner UID match 33
>     0     0 LOGDROP    udp  --  *      eth0    0.0.0.0/0           
> 0.0.0.0/0           udp dpt:80 owner UID match 33
>     0     0 LOGDROP    tcp  --  *      eth0    0.0.0.0/0           
> 0.0.0.0/0           tcp dpt:443 owner UID match 33
>     0     0 LOGDROP    udp  --  *      eth0    0.0.0.0/0           
> 0.0.0.0/0           udp dpt:443 owner UID match 33
>     0     0 ACCEPT     tcp  --  *      eth0    0.0.0.0/0           
> 176.31.238.131      tcp dpt:1194
> 11871 3146K ACCEPT     udp  --  *      *       0.0.0.0/0           
> 0.0.0.0/0           udp dpt:5405 /* Corosync */
>     0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0           
> 0.0.0.0/0           tcp dpt:22
>     0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0           
> 0.0.0.0/0           tcp dpt:25
>     0     0 ACCEPT     tcp  --  *      eth0    0.0.0.0/0           
> 0.0.0.0/0           tcp dpt:43
>     0     0 ACCEPT     tcp  --  *      eth0    0.0.0.0/0           
> 0.0.0.0/0           tcp dpt:53
>     0     0 ACCEPT     udp  --  *      eth0    0.0.0.0/0           
> 0.0.0.0/0           udp dpt:53
>     0     0 ACCEPT     tcp  --  *      eth0    0.0.0.0/0           
> 0.0.0.0/0           tcp dpt:80
>     0     0 ACCEPT     udp  --  *      eth0    0.0.0.0/0           
> 0.0.0.0/0           udp dpt:123
>     0     0 ACCEPT     tcp  --  *      eth0    0.0.0.0/0           
> 0.0.0.0/0           tcp dpt:443
>     0     0 ACCEPT     tcp  --  *      eth0    0.0.0.0/0           
> 0.0.0.0/0           tcp dpt:873
>    11   924 ACCEPT     icmp --  *      *       0.0.0.0/0           
> 0.0.0.0/0          
>  1071  712K ACCEPT     all  --  *      *       0.0.0.0/0           
> 0.0.0.0/0           state RELATED,ESTABLISHED
>    67 14013 LOGDROP    all  --  *      *       0.0.0.0/0           
> 0.0.0.0/0          
>
> Chain LOGDROP (12 references)
>  pkts bytes target     prot opt in     out     source              
> destination        
>    57 11655 LOG        all  --  *      *       0.0.0.0/0           
> 0.0.0.0/0           limit: avg 1/sec burst 5 LOG flags 0 level 5 prefix
> `iptables rejected: '
>    70 14145 DROP       all  --  *      *       0.0.0.0/0           
> 0.0.0.0/0      
>
> # corosync -v
> Corosync Cluster Engine, version '1.4.2'
> Copyright (c) 2006-2009 Red Hat, Inc.
>
> I've been trying to solve this problem for the last 2 days, without any
> result. Any help is welcome.
>
> Thank you in advance!
>
> Regards.
>
