Re: [Pacemaker] CoroSync's UDPu transport for public IP addresses?

Dmitry Koterov wrote:

such messages (for now). But, anyway, DNS names in ringX_addr do not seem
to work, and there are no relevant messages in the default logs. Maybe add
some validation for ringX_addr?

I have resolvable DNS names:

root@node1:/etc/corosync# ping -c1 -W100 node1 | grep from
64 bytes from node1 (127.0.1.1): icmp_seq=1 ttl=64 time=0.039 ms


This is the problem. Resolving node1 to a loopback address (127.0.1.1 in
your output) is simply wrong. Names you want to use in corosync.conf
should resolve to the interface address. I believe the other nodes have a
similar setting (so node2 resolved on node2 is again a loopback address).
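
The check described above can be sketched in Python. This is only an illustration, not part of corosync: the helper names resolve_ipv4 and loopback_names are made up for this example, and it assumes IPv4 names resolved through the node's own resolver (so /etc/hosts is consulted first on a typical setup).

```python
import ipaddress
import socket


def resolve_ipv4(name):
    """Return the sorted IPv4 addresses the local resolver gives for name."""
    infos = socket.getaddrinfo(name, None, family=socket.AF_INET)
    return sorted({info[4][0] for info in infos})


def loopback_names(names, resolve=resolve_ipv4):
    """Map each name that resolves to a loopback address to those addresses.

    `resolve` is injectable so the check can be exercised without DNS.
    """
    bad = {}
    for name in names:
        hits = [a for a in resolve(name) if ipaddress.ip_address(a).is_loopback]
        if hits:
            bad[name] = hits
    return bad
```

Running loopback_names(["node1", "node2", "node3"]) on each node should return an empty dict before the names are used in ring0_addr; any entry it reports points at a hosts-file line worth inspecting.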


Wow! What a shame! How could I miss it... So you're absolutely right,
thanks: that was the cause, an entry in /etc/hosts. On some machines I
had removed it manually, but on others I hadn't. Now I remove it automatically
with sed -i -r "/^.*[[:space:]]$host([[:space:]]|\$)/d" /etc/hosts in the
initialization script.
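
A narrower variant of that cleanup, sketched in Python rather than sed. It is not the command used above: instead of deleting every line that mentions the host, it only touches lines whose address is a loopback address, and it keeps other names on the same line (for example localhost). The function name drop_loopback_alias is hypothetical.

```python
import ipaddress


def _is_loopback(text):
    """True if text parses as an IP address in a loopback range."""
    try:
        return ipaddress.ip_address(text).is_loopback
    except ValueError:
        return False


def drop_loopback_alias(hosts_text, host):
    """Remove `host` from every loopback line of an /etc/hosts-style text."""
    out = []
    for line in hosts_text.splitlines():
        body, sep, comment = line.partition("#")
        fields = body.split()
        if len(fields) >= 2 and _is_loopback(fields[0]) and host in fields[1:]:
            names = [n for n in fields[1:] if n != host]
            if not names:
                continue  # the line only mapped `host`; drop it entirely
            line = fields[0] + "\t" + " ".join(names)
            if sep:
                line += " #" + comment
        out.append(line)
    return "\n".join(out) + "\n"
```

The function works on text only; reading and rewriting /etc/hosts itself (ideally via a temporary file and rename) is left to the caller.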

I apologize for the mess.

So now I have only one place in corosync.conf where I need to specify a
plain IP address for UDPu: totem.interface.bindnetaddr. If I specify
0.0.0.0 there, I get the message "Service engine 'corosync_quorum'
failed to load for reason 'configuration error: nodelist or
quorum.expected_votes must be configured!'" in the logs (BTW, it does not
say that the mistake is in bindnetaddr). Is there a way to avoid IP
addresses in the configuration completely?

You can just remove the whole interface section. Corosync will find the
correct address from the nodelist.

Regards,
  Honza
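
To illustrate the advice above, here is a small Python sketch that renders a corosync.conf with a nodelist and no totem.interface section. The function name render_corosync_conf is hypothetical, and the layout is an assumed minimal one: the real file in this thread also has crypto settings and other parts that were elided with "...", which are not reproduced here.

```python
def render_corosync_conf(ring0_addrs):
    """Return corosync.conf text with a nodelist and no interface section."""
    lines = [
        "totem {",
        "  version: 2",
        "  transport: udpu",
        "}",
        "",
        "nodelist {",
    ]
    for addr in ring0_addrs:
        # addr may be an IP address or a name that resolves to the
        # node's interface address (not to a loopback address)
        lines += ["  node {", f"    ring0_addr: {addr}", "  }"]
    lines += ["}", "", "quorum {", "  provider: corosync_votequorum", "}"]
    return "\n".join(lines) + "\n"
```

For example, print(render_corosync_conf(["node1", "node2", "node3"])) produces a file with no bindnetaddr anywhere, so no plain IP address has to be written by hand.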




Please try to fix this problem first, and let's see whether that solves
the issue you are hitting.

Regards,
   Honza

root@node1:/etc/corosync# ping -c1 -W100 node2 | grep from
64 bytes from node2 (188.166.54.190): icmp_seq=1 ttl=55 time=88.3 ms

root@node1:/etc/corosync# ping -c1 -W100 node3 | grep from
64 bytes from node3 (128.199.116.218): icmp_seq=1 ttl=51 time=252 ms


With corosync.conf below, nothing works:
...
nodelist {
   node {
     ring0_addr: node1
   }
   node {
     ring0_addr: node2
   }
   node {
     ring0_addr: node3
   }
}
...
Jan 14 10:47:44 node1 corosync[15061]:  [MAIN  ] Corosync Cluster Engine ('2.3.3'): started and ready to provide service.
Jan 14 10:47:44 node1 corosync[15061]:  [MAIN  ] Corosync built-in features: dbus testagents rdma watchdog augeas pie relro bindnow
Jan 14 10:47:44 node1 corosync[15062]:  [TOTEM ] Initializing transport (UDP/IP Unicast).
Jan 14 10:47:44 node1 corosync[15062]:  [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
Jan 14 10:47:44 node1 corosync[15062]:  [TOTEM ] The network interface [a.b.c.d] is now up.
Jan 14 10:47:44 node1 corosync[15062]:  [SERV  ] Service engine loaded: corosync configuration map access [0]
Jan 14 10:47:44 node1 corosync[15062]:  [QB    ] server name: cmap
Jan 14 10:47:44 node1 corosync[15062]:  [SERV  ] Service engine loaded: corosync configuration service [1]
Jan 14 10:47:44 node1 corosync[15062]:  [QB    ] server name: cfg
Jan 14 10:47:44 node1 corosync[15062]:  [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Jan 14 10:47:44 node1 corosync[15062]:  [QB    ] server name: cpg
Jan 14 10:47:44 node1 corosync[15062]:  [SERV  ] Service engine loaded: corosync profile loading service [4]
Jan 14 10:47:44 node1 corosync[15062]:  [WD    ] No Watchdog, try modprobe <a watchdog>
Jan 14 10:47:44 node1 corosync[15062]:  [WD    ] no resources configured.
Jan 14 10:47:44 node1 corosync[15062]:  [SERV  ] Service engine loaded: corosync watchdog service [7]
Jan 14 10:47:44 node1 corosync[15062]:  [QUORUM] Using quorum provider corosync_votequorum
Jan 14 10:47:44 node1 corosync[15062]:  [QUORUM] Quorum provider: corosync_votequorum failed to initialize.
Jan 14 10:47:44 node1 corosync[15062]:  [SERV  ] Service engine 'corosync_quorum' failed to load for reason 'configuration error: nodelist or quorum.expected_votes must be configured!'
Jan 14 10:47:44 node1 corosync[15062]:  [MAIN  ] Corosync Cluster Engine exiting with status 20 at service.c:356.


But with IP addresses specified in ringX_addr, everything works:
...
nodelist {
   node {
     ring0_addr: 104.236.71.79
   }
   node {
     ring0_addr: 188.166.54.190
   }
   node {
     ring0_addr: 128.199.116.218
   }
}
...
Jan 14 10:48:28 node1 corosync[15155]:  [MAIN  ] Corosync Cluster Engine ('2.3.3'): started and ready to provide service.
Jan 14 10:48:28 node1 corosync[15155]:  [MAIN  ] Corosync built-in features: dbus testagents rdma watchdog augeas pie relro bindnow
Jan 14 10:48:28 node1 corosync[15156]:  [TOTEM ] Initializing transport (UDP/IP Unicast).
Jan 14 10:48:28 node1 corosync[15156]:  [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
Jan 14 10:48:28 node1 corosync[15156]:  [TOTEM ] The network interface [a.b.c.d] is now up.
Jan 14 10:48:28 node1 corosync[15156]:  [SERV  ] Service engine loaded: corosync configuration map access [0]
Jan 14 10:48:28 node1 corosync[15156]:  [QB    ] server name: cmap
Jan 14 10:48:28 node1 corosync[15156]:  [SERV  ] Service engine loaded: corosync configuration service [1]
Jan 14 10:48:28 node1 corosync[15156]:  [QB    ] server name: cfg
Jan 14 10:48:28 node1 corosync[15156]:  [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Jan 14 10:48:28 node1 corosync[15156]:  [QB    ] server name: cpg
Jan 14 10:48:28 node1 corosync[15156]:  [SERV  ] Service engine loaded: corosync profile loading service [4]
Jan 14 10:48:28 node1 corosync[15156]:  [WD    ] No Watchdog, try modprobe <a watchdog>
Jan 14 10:48:28 node1 corosync[15156]:  [WD    ] no resources configured.
Jan 14 10:48:28 node1 corosync[15156]:  [SERV  ] Service engine loaded: corosync watchdog service [7]
Jan 14 10:48:28 node1 corosync[15156]:  [QUORUM] Using quorum provider corosync_votequorum
Jan 14 10:48:28 node1 corosync[15156]:  [SERV  ] Service engine loaded: corosync vote quorum service v1.0 [5]
Jan 14 10:48:28 node1 corosync[15156]:  [QB    ] server name: votequorum
Jan 14 10:48:28 node1 corosync[15156]:  [SERV  ] Service engine loaded: corosync cluster quorum service v0.1 [3]
Jan 14 10:48:28 node1 corosync[15156]:  [QB    ] server name: quorum
Jan 14 10:48:28 node1 corosync[15156]:  [TOTEM ] adding new UDPU member {a.b.c.d}
Jan 14 10:48:28 node1 corosync[15156]:  [TOTEM ] adding new UDPU member {e.f.g.h}
Jan 14 10:48:28 node1 corosync[15156]:  [TOTEM ] adding new UDPU member {i.j.k.l}
Jan 14 10:48:28 node1 corosync[15156]:  [TOTEM ] A new membership (m.n.o.p:80) was formed. Members joined: 1760315215
Jan 14 10:48:28 node1 corosync[15156]:  [QUORUM] Members[1]: 1760315215
Jan 14 10:48:28 node1 corosync[15156]:  [MAIN  ] Completed service synchronization, ready to provide service.


On Mon, Jan 5, 2015 at 6:45 PM, Jan Friesse <jfriesse@xxxxxxxxxx> wrote:

Dmitry,


Sure, in the logs I see "adding new UDPU member {IP_ADDRESS}" (so DNS names
are definitely resolved), but in practice the cluster does not work, as I
said above. So validation of ringX_addr in corosync.conf would be very
helpful.

That's weird, because once a DNS name is resolved, corosync works only
with the IP address. This means the code path is exactly the same with an
IP address or with a DNS name. Do you have logs from corosync?

Honza



On Fri, Jan 2, 2015 at 2:49 PM, Jan Friesse <jfriesse@xxxxxxxxxx>
wrote:

Dmitry,


No, I meant that if you pass a domain name in ring0_addr, there are no
errors in the logs, corosync even seems to find the nodes (based on its
logs), and crm_node -l shows them, but in practice nothing really works. A
verbose error message would be very helpful in such a case.


This sounds weird. Are you sure that the DNS names really map to the
correct IP addresses? In the logs there should be something like "adding
new UDPU member {IP_ADDRESS}".

Regards,
   Honza
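
One way to do the log check suggested above is to pull the resolved addresses out of the corosync log with a short Python helper. This is a sketch only: the function name udpu_members is made up, and the pattern is based solely on the "adding new UDPU member {...}" lines quoted in this thread, so other corosync versions may log a different format.

```python
import re

# Matches e.g. "[TOTEM ] adding new UDPU member {a.b.c.d}"
UDPU_MEMBER = re.compile(r"\[TOTEM\s*\]\s+adding new UDPU member\s+\{([^}]+)\}")


def udpu_members(log_text):
    """Return the addresses from 'adding new UDPU member' log lines, in order."""
    # Log lines pasted into mail are often wrapped, so collapse all
    # whitespace (including newlines) before matching.
    flat = " ".join(log_text.split())
    return UDPU_MEMBER.findall(flat)
```

Comparing the returned list with the addresses the node names are expected to have shows quickly whether a name was resolved to something unexpected, such as a loopback address.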


On Tuesday, December 30, 2014, Daniel Dehennin <
daniel.dehennin@xxxxxxxxxxxx>
wrote:

  Dmitry Koterov <dmitry.koterov@xxxxxxxxx <javascript:;>> writes:

Oh, it seems I've found the solution! There were at least two mistakes in
my corosync.conf (BTW, the logs did not report any errors, so my conclusion
is based on my experiments only).

1. nodelist.node MUST contain only IP addresses. No hostnames! They simply
do not work: "crm status" shows no nodes, and there are no warnings in the
logs regarding this.


You can add name like this:

      nodelist {
        node {
          ring0_addr: <public-ip-address-of-the-first-machine>
          name: node1
        }
        node {
          ring0_addr: <public-ip-address-of-the-second-machine>
          name: node2
        }
      }

I used it on Ubuntu Trusty with udpu.

Regards.

--
Daniel Dehennin
Get my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
Fingerprint: 3E69 014E 5C23 50E8 9ED6  2AAD CC1E 9E5B 7A6F E2DF




_______________________________________________
Pacemaker mailing list: Pacemaker@xxxxxxxxxxxxxxxxxxx
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started:
http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org





_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss


