Re: Corosync instances seems to ignore each other despite many UDP chat without firewall

David,
there seems to be one definite issue and at least two other possible problems:
1.) (this is the definite issue)

Your config looks like:
...
                 member {
                         memberaddr: 176.31.238.131
                 }
                 ringnumber: 0
                 bindnetaddr: 37.59.18.208
...

The member list must contain all cluster members, including the local node's own IP address. So, let's say you have:
node 1 (176.31.238.131)
node 2 (37.59.18.208)

then on both nodes you must have:
                 member {
                         memberaddr: 176.31.238.131
                 }
                 member {
                         memberaddr: 37.59.18.208
                 }
2.) Firewall. Even if it looks OK, make sure that you have opened everything corosync needs, which is:
- listening on port 5405 (the mcastport in your config)
- ability to send to port 5405
- ability to send from any port (there is basically one sending socket per udpu member, and its port number is allocated by the kernel). I've committed a patch which binds that socket to a concrete IP; in older versions (currently anything that is not master) the sender address is 0.0.0.0.

3.)
176.31.238.131 and 37.59.18.208 don't seem to be on the same network. There may be a problem with a router between these networks, which may block the traffic.
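
To make point 3 concrete, here is a minimal Python sketch that checks whether two addresses fall into the same network. The function name same_network and the /24 prefix length are assumptions for illustration only; use the real netmask reported by your interface configuration.

```python
import ipaddress

def same_network(addr_a, addr_b, prefix_len):
    """Return True if addr_b lies in the network of addr_a/prefix_len."""
    # strict=False lets us pass a host address instead of a network address.
    net = ipaddress.ip_network(f"{addr_a}/{prefix_len}", strict=False)
    return ipaddress.ip_address(addr_b) in net

# The two node addresses from this thread; /24 is an assumed prefix length.
print(same_network("37.59.18.208", "176.31.238.131", 24))  # prints False
```

If this prints False for your real netmask, the traffic between the nodes is routed, so every router and filter on the path has to let the corosync UDP traffic through.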

But as a first step, add the missing member address.
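
A rough way to check the member list before restarting corosync is sketched below in Python. The function name missing_members is made up for this example, and the regular expression only understands the simple "memberaddr: <ip>" lines shown in this thread, so treat it as a sanity check rather than a real corosync.conf parser.

```python
import re

def missing_members(conf_text, node_addrs):
    """Return the node addresses that have no memberaddr entry in conf_text."""
    listed = set(
        re.findall(r"^\s*memberaddr:\s*(\S+)", conf_text, flags=re.MULTILINE)
    )
    return [addr for addr in node_addrs if addr not in listed]

# Interface section as posted in the original message.
posted = """
interface {
        member {
                memberaddr: 176.31.238.131
        }
        ringnumber: 0
        bindnetaddr: 37.59.18.208
        mcastport: 5405
}
"""

# Every node of the cluster, including the local one.
nodes = ["176.31.238.131", "37.59.18.208"]
print(missing_members(posted, nodes))  # prints ['37.59.18.208']
```

An empty list means every node, including the local one, has its own member entry.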

Regards,
  Honza

David Guyot wrote:
Hello, everybody.

I'm trying to establish a 2-node Debian Squeeze x64 cluster with
Corosync and Pacemaker, but I'm stuck on a strange issue: despite a
lot of UDP traffic between the nodes (so the network itself is OK), the
Corosync instances seem to ignore each other: the other node is never
detected, and crm_mon --one-shot -V only says "Connection to cluster
failed: connection failed". The strangest part is that both
Corosync nodes are filling their logs with error messages saying "Totem
is unable to form a cluster because of an operating system or network
fault. The most common cause of this message is that the local firewall
is configured improperly.". I ran tcpdump on all traffic between the
hosts, and I see 2-way traffic between them. I tried the backports
versions of all Corosync- and Pacemaker-related packages, without
improvement.

I must add that, due to my hosting company's network policy, I was forced
to use UDP unicast instead of multicast, because multicast is blocked.

Here is my config:
corosync.conf:
# Please read the corosync.conf.5 manual page
compatibility: whitetank

totem {
         version: 2
         secauth: on
         interface {
                 member {
                         memberaddr: 176.31.238.131
                 }
                 ringnumber: 0
                 bindnetaddr: 37.59.18.208
                 mcastport: 5405
                 ttl: 1
         }
         transport: udpu
}

logging {
         fileline: off
         to_logfile: yes
         to_syslog: yes
         debug: on
         logfile: /var/log/corosync.log
         debug: off
         timestamp: on
         logger_subsys {
                 subsys: AMF
                 debug: off
         }
}

Log messages:
Jun 06 16:35:14 corosync [MAIN  ] Corosync Cluster Engine ('1.4.2'): started and ready to provide service.
Jun 06 16:35:14 corosync [MAIN  ] Corosync built-in features: nss
Jun 06 16:35:14 corosync [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
Jun 06 16:35:14 corosync [TOTEM ] Initializing transport (UDP/IP Unicast).
Jun 06 16:35:14 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Jun 06 16:35:14 corosync [TOTEM ] The network interface [37.59.18.208] is now up.
Jun 06 16:35:14 corosync [SERV  ] Service engine loaded: corosync extended virtual synchrony service
Jun 06 16:35:14 corosync [SERV  ] Service engine loaded: corosync configuration service
Jun 06 16:35:14 corosync [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01
Jun 06 16:35:14 corosync [SERV  ] Service engine loaded: corosync cluster config database access v1.01
Jun 06 16:35:14 corosync [SERV  ] Service engine loaded: corosync profile loading service
Jun 06 16:35:14 corosync [SERV  ] Service engine loaded: corosync cluster quorum service v0.1
Jun 06 16:35:14 corosync [MAIN  ] Compatibility mode set to whitetank. Using V1 and V2 of the synchronization engine.
Jun 06 16:35:23 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
Jun 06 16:35:25 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
Jun 06 16:35:27 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
Jun 06 16:35:30 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.

# uname -a
Linux Vindemiatrix 3.2.13-grsec-xxxx-grs-ipv6-64 #1 SMP Thu Mar 29
09:48:59 UTC 2012 x86_64 GNU/Linux

# iptables -nvL
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 ACCEPT     all  --  tun0   *       0.0.0.0/0            0.0.0.0/0
    0     0 ACCEPT     all  --  lo     *       0.0.0.0/0            0.0.0.0/0
    0     0            tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           tcp dpt:22 state NEW recent: SET name: SSH side: source
    0     0 LOGDROP    tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           tcp dpt:22 state NEW recent: UPDATE seconds: 60 hit_count: 6 TTL-Match name: SSH side: source
    0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           tcp dpt:22 state NEW
    0     0 LOGDROP    tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           tcp flags:0x17/0x02 multiport dports 80,443 #conn/32 100
    1    48 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           tcp flags:0x17/0x02 multiport dports 80,443
    0     0 ACCEPT     tcp  --  eth0   *       0.0.0.0/0            0.0.0.0/0           tcp dpt:21 flags:0x17/0x02 limit: avg 5/min burst 50 recent: SET name: FTP side: source
    0     0 LOGDROP    tcp  --  eth0   *       0.0.0.0/0            0.0.0.0/0           tcp dpt:21 flags:0x17/0x02 recent: UPDATE seconds: 60 hit_count: 6 TTL-Match name: FTP side: source
    0     0 ACCEPT     tcp  --  eth0   *       0.0.0.0/0            0.0.0.0/0           tcp dpt:21 flags:0x17/0x02
    0     0 ACCEPT     tcp  --  eth0   *       0.0.0.0/0            0.0.0.0/0           tcp dpts:50000:50500 state RELATED,ESTABLISHED
    0     0 ACCEPT     tcp  --  eth0   *       176.31.238.131       0.0.0.0/0           tcp dpt:1194
11867 3145K ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0           udp dpt:5405 /* Corosync */
   35  9516 ACCEPT     all  --  eth0   *       0.0.0.0/0            0.0.0.0/0           state NEW limit: avg 30/sec burst 200
    0     0 LOGDROP    tcp  --  eth0   *       0.0.0.0/0            0.0.0.0/0           tcp dpt:80 STRING match "w00tw00t.at.ISC.SANS." ALGO name bm TO 65535
    0     0 ACCEPT     icmp --  *      *       0.0.0.0/0            0.0.0.0/0           limit: avg 10/sec burst 5
    0     0 LOGDROP    icmp --  *      *       0.0.0.0/0            0.0.0.0/0
 1031 70356 ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0           state RELATED,ESTABLISHED
    3   132 LOGDROP    all  --  *      *       0.0.0.0/0            0.0.0.0/0

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 LOGDROP    all  --  *      *       0.0.0.0/0            0.0.0.0/0

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 ACCEPT     all  --  *      tun0    0.0.0.0/0            0.0.0.0/0
    0     0 ACCEPT     all  --  *      lo      0.0.0.0/0            0.0.0.0/0
    0     0 LOGDROP    tcp  --  *      eth0    0.0.0.0/0            0.0.0.0/0           tcp dpt:80 owner UID match 33
    0     0 LOGDROP    udp  --  *      eth0    0.0.0.0/0            0.0.0.0/0           udp dpt:80 owner UID match 33
    0     0 LOGDROP    tcp  --  *      eth0    0.0.0.0/0            0.0.0.0/0           tcp dpt:443 owner UID match 33
    0     0 LOGDROP    udp  --  *      eth0    0.0.0.0/0            0.0.0.0/0           udp dpt:443 owner UID match 33
    0     0 ACCEPT     tcp  --  *      eth0    0.0.0.0/0            176.31.238.131      tcp dpt:1194
11871 3146K ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0           udp dpt:5405 /* Corosync */
    0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           tcp dpt:22
    0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           tcp dpt:25
    0     0 ACCEPT     tcp  --  *      eth0    0.0.0.0/0            0.0.0.0/0           tcp dpt:43
    0     0 ACCEPT     tcp  --  *      eth0    0.0.0.0/0            0.0.0.0/0           tcp dpt:53
    0     0 ACCEPT     udp  --  *      eth0    0.0.0.0/0            0.0.0.0/0           udp dpt:53
    0     0 ACCEPT     tcp  --  *      eth0    0.0.0.0/0            0.0.0.0/0           tcp dpt:80
    0     0 ACCEPT     udp  --  *      eth0    0.0.0.0/0            0.0.0.0/0           udp dpt:123
    0     0 ACCEPT     tcp  --  *      eth0    0.0.0.0/0            0.0.0.0/0           tcp dpt:443
    0     0 ACCEPT     tcp  --  *      eth0    0.0.0.0/0            0.0.0.0/0           tcp dpt:873
   11   924 ACCEPT     icmp --  *      *       0.0.0.0/0            0.0.0.0/0
 1071  712K ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0           state RELATED,ESTABLISHED
   67 14013 LOGDROP    all  --  *      *       0.0.0.0/0            0.0.0.0/0

Chain LOGDROP (12 references)
 pkts bytes target     prot opt in     out     source               destination
   57 11655 LOG        all  --  *      *       0.0.0.0/0            0.0.0.0/0           limit: avg 1/sec burst 5 LOG flags 0 level 5 prefix `iptables rejected: '
   70 14145 DROP       all  --  *      *       0.0.0.0/0            0.0.0.0/0

# corosync -v
Corosync Cluster Engine, version '1.4.2'
Copyright (c) 2006-2009 Red Hat, Inc.

I've been trying to solve this problem for the last 2 days, without any
result. Any help is welcome.

Thank you in advance!

Regards.




_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss
