Re: [Pacemaker] CoroSync's UDPu transport for public IP addresses?

> such messages (for now). But, anyway, DNS names in ringX_addr do not seem
> to work, and there are no relevant messages in the default logs. Maybe add
> some validation for ringX_addr?
>
> I have resolvable DNS names:
>
> root@node1:/etc/corosync# ping -c1 -W100 node1 | grep from
> 64 bytes from node1 (127.0.1.1): icmp_seq=1 ttl=64 time=0.039 ms
>

This is the problem. Resolving node1 to a loopback address (127.0.1.1
here) is simply wrong. Names you want to use in corosync.conf should
resolve to the interface address. I believe the other nodes have a
similar setting (so node2 resolved on node2 is again a loopback address).
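
A quick way to check this on each node is sketched below. It is only an illustration: the function names is_loopback and check_ring_names are made up for this example, it assumes getent and awk are available, and it only recognizes IPv4 loopback addresses (127.0.0.0/8), not ::1.

```shell
#!/bin/sh
# Return 0 if the given IPv4 address is in the loopback range 127.0.0.0/8.
is_loopback() {
    case "$1" in
        127.*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Print a warning for every name that does not resolve at all or that
# resolves to a loopback address. getent goes through the normal name
# service switch, so entries in /etc/hosts are taken into account.
check_ring_names() {
    for name in "$@"; do
        addr=$(getent hosts "$name" | awk '{ print $1; exit }')
        if [ -z "$addr" ]; then
            echo "$name: does not resolve"
        elif is_loopback "$addr"; then
            echo "$name: resolves to loopback address $addr"
        fi
    done
}

# Example call (node names taken from this thread):
#   check_ring_names node1 node2 node3
```

Run on node1 with the /etc/hosts entry described above, the example call would be expected to flag node1 only; the check has to be repeated on every node, because each node may map its own name to a loopback address.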

Wow! What a shame! How could I miss it... So you're absolutely right, thanks: that was the cause, an entry in /etc/hosts. On some machines I had removed it manually, but on others I had not. Now I remove it automatically in the initialization script with sed -i -r "/^.*[[:space:]]$host([[:space:]]|\$)/d" /etc/hosts.
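
For illustration, a narrower variant of that cleanup might look like the sketch below. It differs from the sed one-liner above in two ways, both of which are choices made for this example: it removes only lines whose address starts with 127., and it compares the hostname as an exact string rather than as a regular expression. The function name strip_loopback_entries and the optional file argument are hypothetical.

```shell
#!/bin/sh
# Remove lines that map the given hostname to a 127.x.x.x address.
# Usage: strip_loopback_entries HOSTNAME [HOSTS_FILE]
strip_loopback_entries() {
    host=$1
    file=${2:-/etc/hosts}
    tmp=$(mktemp) || return 1
    awk -v h="$host" '
        # Keep comments, blank lines and every non-loopback mapping.
        $1 !~ /^127\./ { print; next }
        {
            for (i = 2; i <= NF; i++) {
                if (substr($i, 1, 1) == "#") break   # rest is a comment
                if ($i == h) next                     # drop this line
            }
            print
        }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example call (must be run as root to modify /etc/hosts):
#   strip_loopback_entries "$(hostname)"
```

Writing the result back with cat keeps the original file's owner and permissions, and a line such as "104.236.71.79 node1" is left untouched, so a correct mapping in /etc/hosts survives the cleanup.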

I apologize for the mess.

So now there is only one place in corosync.conf where I need to specify a plain IP address for UDPu: totem.interface.bindnetaddr. If I specify 0.0.0.0 there, I get the message "Service engine 'corosync_quorum' failed to load for reason 'configuration error: nodelist or quorum.expected_votes must be configured!'" in the logs (BTW, it does not say that the mistake is in bindnetaddr). Is there a way to avoid IP addresses in the configuration completely?

 
Please try to fix this problem first and let's see if this solves the
issue you are hitting.

Regards,
  Honza

> root@node1:/etc/corosync# ping -c1 -W100 node2 | grep from
> 64 bytes from node2 (188.166.54.190): icmp_seq=1 ttl=55 time=88.3 ms
>
> root@node1:/etc/corosync# ping -c1 -W100 node3 | grep from
> 64 bytes from node3 (128.199.116.218): icmp_seq=1 ttl=51 time=252 ms
>
>
> With corosync.conf below, nothing works:
> ...
> nodelist {
>   node {
>     ring0_addr: node1
>   }
>   node {
>     ring0_addr: node2
>   }
>   node {
>     ring0_addr: node3
>   }
> }
> ...
> Jan 14 10:47:44 node1 corosync[15061]:  [MAIN  ] Corosync Cluster Engine
> ('2.3.3'): started and ready to provide service.
> Jan 14 10:47:44 node1 corosync[15061]:  [MAIN  ] Corosync built-in
> features: dbus testagents rdma watchdog augeas pie relro bindnow
> Jan 14 10:47:44 node1 corosync[15062]:  [TOTEM ] Initializing transport
> (UDP/IP Unicast).
> Jan 14 10:47:44 node1 corosync[15062]:  [TOTEM ] Initializing
> transmit/receive security (NSS) crypto: aes256 hash: sha1
> Jan 14 10:47:44 node1 corosync[15062]:  [TOTEM ] The network interface
> [a.b.c.d] is now up.
> Jan 14 10:47:44 node1 corosync[15062]:  [SERV  ] Service engine loaded:
> corosync configuration map access [0]
> Jan 14 10:47:44 node1 corosync[15062]:  [QB    ] server name: cmap
> Jan 14 10:47:44 node1 corosync[15062]:  [SERV  ] Service engine loaded:
> corosync configuration service [1]
> Jan 14 10:47:44 node1 corosync[15062]:  [QB    ] server name: cfg
> Jan 14 10:47:44 node1 corosync[15062]:  [SERV  ] Service engine loaded:
> corosync cluster closed process group service v1.01 [2]
> Jan 14 10:47:44 node1 corosync[15062]:  [QB    ] server name: cpg
> Jan 14 10:47:44 node1 corosync[15062]:  [SERV  ] Service engine loaded:
> corosync profile loading service [4]
> Jan 14 10:47:44 node1 corosync[15062]:  [WD    ] No Watchdog, try modprobe
> <a watchdog>
> Jan 14 10:47:44 node1 corosync[15062]:  [WD    ] no resources configured.
> Jan 14 10:47:44 node1 corosync[15062]:  [SERV  ] Service engine loaded:
> corosync watchdog service [7]
> Jan 14 10:47:44 node1 corosync[15062]:  [QUORUM] Using quorum provider
> corosync_votequorum
> Jan 14 10:47:44 node1 corosync[15062]:  [QUORUM] Quorum provider:
> corosync_votequorum failed to initialize.
> Jan 14 10:47:44 node1 corosync[15062]:  [SERV  ] Service engine
> 'corosync_quorum' failed to load for reason 'configuration error: nodelist
> or quorum.expected_votes must be configured!'
> Jan 14 10:47:44 node1 corosync[15062]:  [MAIN  ] Corosync Cluster Engine
> exiting with status 20 at service.c:356.
>
>
> But with IP addresses specified in ringX_addr, everything works:
> ...
> nodelist {
>   node {
>     ring0_addr: 104.236.71.79
>   }
>   node {
>     ring0_addr: 188.166.54.190
>   }
>   node {
>     ring0_addr: 128.199.116.218
>   }
> }
> ...
> Jan 14 10:48:28 node1 corosync[15155]:  [MAIN  ] Corosync Cluster Engine
> ('2.3.3'): started and ready to provide service.
> Jan 14 10:48:28 node1 corosync[15155]:  [MAIN  ] Corosync built-in
> features: dbus testagents rdma watchdog augeas pie relro bindnow
> Jan 14 10:48:28 node1 corosync[15156]:  [TOTEM ] Initializing transport
> (UDP/IP Unicast).
> Jan 14 10:48:28 node1 corosync[15156]:  [TOTEM ] Initializing
> transmit/receive security (NSS) crypto: aes256 hash: sha1
> Jan 14 10:48:28 node1 corosync[15156]:  [TOTEM ] The network interface
> [a.b.c.d] is now up.
> Jan 14 10:48:28 node1 corosync[15156]:  [SERV  ] Service engine loaded:
> corosync configuration map access [0]
> Jan 14 10:48:28 node1 corosync[15156]:  [QB    ] server name: cmap
> Jan 14 10:48:28 node1 corosync[15156]:  [SERV  ] Service engine loaded:
> corosync configuration service [1]
> Jan 14 10:48:28 node1 corosync[15156]:  [QB    ] server name: cfg
> Jan 14 10:48:28 node1 corosync[15156]:  [SERV  ] Service engine loaded:
> corosync cluster closed process group service v1.01 [2]
> Jan 14 10:48:28 node1 corosync[15156]:  [QB    ] server name: cpg
> Jan 14 10:48:28 node1 corosync[15156]:  [SERV  ] Service engine loaded:
> corosync profile loading service [4]
> Jan 14 10:48:28 node1 corosync[15156]:  [WD    ] No Watchdog, try modprobe
> <a watchdog>
> Jan 14 10:48:28 node1 corosync[15156]:  [WD    ] no resources configured.
> Jan 14 10:48:28 node1 corosync[15156]:  [SERV  ] Service engine loaded:
> corosync watchdog service [7]
> Jan 14 10:48:28 node1 corosync[15156]:  [QUORUM] Using quorum provider
> corosync_votequorum
> Jan 14 10:48:28 node1 corosync[15156]:  [SERV  ] Service engine loaded:
> corosync vote quorum service v1.0 [5]
> Jan 14 10:48:28 node1 corosync[15156]:  [QB    ] server name: votequorum
> Jan 14 10:48:28 node1 corosync[15156]:  [SERV  ] Service engine loaded:
> corosync cluster quorum service v0.1 [3]
> Jan 14 10:48:28 node1 corosync[15156]:  [QB    ] server name: quorum
> Jan 14 10:48:28 node1 corosync[15156]:  [TOTEM ] adding new UDPU member
> {a.b.c.d}
> Jan 14 10:48:28 node1 corosync[15156]:  [TOTEM ] adding new UDPU member
> {e.f.g.h}
> Jan 14 10:48:28 node1 corosync[15156]:  [TOTEM ] adding new UDPU member
> {i.j.k.l}
> Jan 14 10:48:28 node1 corosync[15156]:  [TOTEM ] A new membership
> (m.n.o.p:80) was formed. Members joined: 1760315215
> Jan 14 10:48:28 node1 corosync[15156]:  [QUORUM] Members[1]: 1760315215
> Jan 14 10:48:28 node1 corosync[15156]:  [MAIN  ] Completed service
> synchronization, ready to provide service.
>
>
> On Mon, Jan 5, 2015 at 6:45 PM, Jan Friesse <jfriesse@xxxxxxxxxx> wrote:
>
>> Dmitry,
>>
>>
>>> Sure, in logs I see "adding new UDPU member {IP_ADDRESS}" (so DNS names
>>> are definitely resolved), but in practice the cluster does not work, as I
>>> said above. So validations of ringX_addr in corosync.conf would be very
>>> helpful in corosync.
>>
>> That's weird, because once the DNS name is resolved, corosync works only
>> with the IP address. This means the code path is exactly the same with an
>> IP address or with a DNS name. Do you have logs from corosync?
>>
>> Honza
>>
>>
>>>
>>> On Fri, Jan 2, 2015 at 2:49 PM, Jan Friesse <jfriesse@xxxxxxxxxx> wrote:
>>>
>>>> Dmitry,
>>>>
>>>>
>>>>> No, I meant that if you pass a domain name in ring0_addr, there are no
>>>>> errors in logs, corosync even seems to find nodes (based on its logs), and
>>>>> crm_node -l shows them, but in practice nothing really works. A verbose
>>>>> error message would be very helpful in such case.
>>>>>
>>>>
>>>> This sounds weird. Are you sure that the DNS names really map to the
>>>> correct IP addresses? In the logs there should be something like "adding
>>>> new UDPU member {IP_ADDRESS}".
>>>>
>>>> Regards,
>>>>   Honza
>>>>
>>>>
>>>>> On Tuesday, December 30, 2014, Daniel Dehennin <daniel.dehennin@xxxxxxxxxxxx> wrote:
>>>>>
>>>>>> Dmitry Koterov <dmitry.koterov@xxxxxxxxx> writes:
>>>>>>
>>>>>>> Oh, seems I've found the solution! At least two mistakes were in my
>>>>>>> corosync.conf (BTW logs did not say about any errors, so my conclusion
>>>>>>> is based on my experiments only).
>>>>>>>
>>>>>>> 1. nodelist.node MUST contain only IP addresses. No hostnames! They simply
>>>>>>> do not work, "crm status" shows no nodes. And no warnings are in logs
>>>>>>> regarding this.
>>>>>>>
>>>>>>
>>>>>> You can add a name like this:
>>>>>>
>>>>>>      nodelist {
>>>>>>        node {
>>>>>>          ring0_addr: <public-ip-address-of-the-first-machine>
>>>>>>          name: node1
>>>>>>        }
>>>>>>        node {
>>>>>>          ring0_addr: <public-ip-address-of-the-second-machine>
>>>>>>          name: node2
>>>>>>        }
>>>>>>      }
>>>>>>
>>>>>> I used it on Ubuntu Trusty with udpu.
>>>>>>
>>>>>> Regards.
>>>>>>
>>>>>> --
>>>>>> Daniel Dehennin
>>>>>> Récupérer ma clef GPG: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
>>>>>> Fingerprint: 3E69 014E 5C23 50E8 9ED6  2AAD CC1E 9E5B 7A6F E2DF
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Pacemaker mailing list: Pacemaker@xxxxxxxxxxxxxxxxxxx
>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>
>>>>> Project Home: http://www.clusterlabs.org
>>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>> Bugs: http://bugs.clusterlabs.org
>>>>>
>>>>>

_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss
