Re: Corosync with two servers around the world

Jan Friesse <jfriesse@xxxxxxxxxx> · Tue, 23 Apr 2013 10:04:26 +0200

Alain,
first make sure that corosync itself works. To do that, remove or comment out

service {
      # Load the Pacemaker Cluster Resource Manager
      ver:       0
      name:      pacemaker
}

Now, either change

to_stderr: yes
to_logfile: no

to

to_stderr: no
to_logfile: yes

or run corosync in the foreground as corosync -f to see the startup
messages. Either way, you should get a message like "The network
interface [123.456.789] is now up." on both nodes, either in
/var/log/cluster/corosync.log (with to_logfile: yes) or on stderr (with
to_stderr: yes and corosync -f).
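
A rough sketch of that log check, assuming the log path above and that the
message text matches the example. The function name iface_up_lines is just
a helper I made up for illustration; it is not part of corosync.

```shell
# Hypothetical helper (not shipped with corosync): print every
# "The network interface ... is now up" line found in a corosync log.
# Assumes to_logfile: yes; the default path is the one mentioned above.
iface_up_lines() {
    grep 'The network interface .* is now up' \
        "${1:-/var/log/cluster/corosync.log}"
}

# Usage, run on each node:
#   iface_up_lines
#   iface_up_lines /path/to/another/corosync.log
```

If this prints nothing on one of the nodes, corosync did not bring the
interface up there, so fix that node first.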

If these messages show up on both nodes, you are on the right track.
The nodes should see each other as long as no firewall is blocking
traffic somewhere in between. To test that, run corosync-cmapctl (for
corosync needle 2.x) or corosync-objctl (for flatiron 1.x). The output
should contain runtime.totem.pg.mrp.srp.members.<nodeid>.ip keys with a
matching status of joined (on each node, you should see both nodes).
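
As a quick sketch of that membership test: the helper below counts members
reported as joined. The name count_joined is hypothetical, and the line
formats in the comments are my assumption of what the two tools print, so
check them against your real output.

```shell
# Hypothetical helper: read corosync-cmapctl (2.x) or corosync-objctl (1.x)
# output on stdin and print how many members have status "joined".
# Assumes status lines look roughly like:
#   runtime.totem.pg.mrp.srp.members.<nodeid>.status (str) = joined   (2.x)
#   runtime.totem.pg.mrp.srp.members.<nodeid>.status=joined           (1.x)
count_joined() {
    grep -c 'members\.[0-9]*\.status.*joined'
}

# Usage, on each node (a healthy two-node cluster should give 2):
#   corosync-cmapctl | count_joined
#   corosync-objctl  | count_joined
```

A result of 1 means the node only sees itself, which points at the network
(multicast, firewall, bindnetaddr) rather than at pacemaker.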

If that is the case, corosync works fine and the problem is in pacemaker
(did you change to ver: 1? If so, did you start the pacemaker daemon?)

Honza

alain meunier napsal(a):
> Hi Honza,
> 
> Multicast is working. I tried with ssmping.
> ping is working as well, with new entries in /etc/hosts (pcmk-1 and pcmk-2 are able to ping each other).
> 
> After applying your advice, crm_mon hangs with "Attempting connection to the cluster........................."
> There are no firewall rules. The servers are OpenVZ VPSes.
> 
> The logs stay empty.
> 
> I don't know what is going wrong there.
> Any clue?
> Alain
> 
> 
>> Date: Tue, 23 Apr 2013 09:37:47 +0200
>> From: jfriesse@xxxxxxxxxx
>> To: deco33@xxxxxxxxxx
>> CC: discuss@xxxxxxxxxxxx
>> Subject: Re:  Corosync with two servers around the world
>>
>> Alain,
>> you are using multicast, so make sure that multicast can actually pass
>> between the nodes (you can use omping for that). If multicast is not
>> getting through, take a look at the UDPU transport in corosync (see man
>> corosync.conf and, as an example configuration,
>> /etc/corosync/corosync.conf.example.udpu).
>>
>> Also, bindnetaddr: must be set on each node to the address/network you
>> want to bind to, so node 1 must have 123.456.789 and node 2 must have
>> 213.546.879.
>>
>> Also, version 0 of the pacemaker plugin is not recommended (use ver: 1).
>>
>> Honza
>>
>> alain meunier napsal(a):
>>> Hi!
>>>
>>> I rented two servers with the IPs (fake example IPs) 123.456.789 (node1) and 213.546.879 (node2).
>>>
>>> I would like node2 to take over from node1 if node1 dies.
>>>
>>> I set up corosync and pacemaker, but crm_mon cannot see the other node.
>>> The only node up is the local one.
>>>
>>> Should I set up a VPN between them?
>>>
>>> Here is my corosync conf:
>>>
>>> node 1 & 2:
>>> [code]
>>>  # Please read the openais.conf.5 manual page
>>>
>>> totem {
>>>     version: 2
>>>
>>>     # How long before declaring a token lost (ms)
>>>     token: 5000
>>>
>>>     # How many token retransmits before forming a new configuration
>>>     token_retransmits_before_loss_const: 20
>>>
>>>     # How long to wait for join messages in the membership protocol (ms)
>>>     join: 1000
>>>
>>>     # How long to wait for consensus to be achieved before starting a new round of membership configuration (ms)
>>>     consensus: 7500
>>>
>>>     # Turn off the virtual synchrony filter
>>>     vsftype: none
>>>
>>>     # Number of messages that may be sent by one processor on receipt of the token
>>>     max_messages: 20
>>>
>>>     # Limit generated nodeids to 31-bits (positive signed integers)
>>>     clear_node_high_bit: yes
>>>
>>>     # Disable encryption
>>>      secauth: on
>>>
>>>     # How many threads to use for encryption/decryption
>>>      threads: 0
>>>
>>>     # Optionally assign a fixed node id (integer)
>>>     # nodeid: 1234
>>>
>>>     # This specifies the mode of redundant ring, which may be none, active, or passive.
>>>      rrp_mode: none
>>>
>>>      interface {
>>>
>>>         # The following values need to be set based on your environment 
>>>         ringnumber: 0
>>>         #bindnetaddr: 127.0.0.1
>>>         bindnetaddr: 123.456.789 
>>>         mcastaddr: 226.14.1.1
>>>         mcastport: 5405
>>>     }
>>> }
>>>
>>> amf {
>>>     mode: disabled
>>> }
>>>
>>> service {
>>>      # Load the Pacemaker Cluster Resource Manager
>>>      ver:       0
>>>      name:      pacemaker
>>> }
>>>
>>> aisexec {
>>>         user:   root
>>>         group:  root
>>> }
>>>
>>> logging {
>>>         fileline: off
>>>         to_stderr: yes
>>>         to_logfile: no
>>>         to_syslog: yes
>>>     syslog_facility: daemon
>>>         debug: off
>>>         timestamp: on
>>>         logger_subsys {
>>>                 subsys: AMF
>>>                 debug: off
>>>                 tags: enter|leave|trace1|trace2|trace3|trace4|trace6
>>>         }
>>> }
>>> [/code]
>>>
>>> The bindnetaddr is the node I want to back up.
>>>
>>> What am I doing wrong, please?
>>>
>>> Many thanks,
>>>
>>> Alain
>>>
>>>
>>>
>>> _______________________________________________
>>> discuss mailing list
>>> discuss@xxxxxxxxxxxx
>>> http://lists.corosync.org/mailman/listinfo/discuss
>>>
>>
> 
