Re: Redundant Infiniband Fabrics

On 09/20/2012 12:41 AM, Vladimir Voznesensky wrote:
> Hello there.
> 
> Has anybody tried to run corosync on a cluster with two rings on different
> InfiniBand fabrics?
> We have several issues here.
> First, usually corosync aborts:
> 
> ---8<---
> 
> # corosync -f
> 
> notice  [MAIN  ] Corosync Cluster Engine ('2.0.1'): started and ready to
> provide service.
> 
> info    [MAIN  ] Corosync built-in features: testagents rdma monitoring
> 
> Sep 20 11:28:50 notice  [TOTEM ] Initializing transport (Infiniband/IP).
> 
> Sep 20 11:28:50 notice  [TOTEM ] Initializing transport (Infiniband/IP).
> 
> corosync: totemsrp.c:3236: memb_ring_id_create_or_load: Assertion
> `!totemip_zero_check(&memb_ring_id->rep)' failed.
> 
> Ringbuffer:
> 
>  ->OVERWRITE
> 
>  ->write_pt [736]
> 
>  ->read_pt [0]
> 
>  ->size [2097152 words]
> 
>  =>free [8385660 bytes]
> 
>  =>used [2944 bytes]
> 
> Aborted
> 
> --->8---
> 
> Then, from time to time it just segfaults:
> 
> ---8<---
> 
> # corosync -f
> 
> notice  [MAIN  ] Corosync Cluster Engine ('2.0.1'): started and ready to
> provide service.
> 
> info    [MAIN  ] Corosync built-in features: testagents rdma monitoring
> 
> Sep 20 11:28:51 notice  [TOTEM ] Initializing transport (Infiniband/IP).
> 
> Sep 20 11:28:51 notice  [TOTEM ] Initializing transport (Infiniband/IP).
> 
> Sep 20 11:28:51 notice  [SERV  ] Service engine loaded: corosync
> configuration map access [0]
> 
> Sep 20 11:28:51 info    [QB    ] server name: cmap
> 
> Sep 20 11:28:51 notice  [SERV  ] Service engine loaded: corosync
> configuration service [1]
> 
> Sep 20 11:28:51 info    [QB    ] server name: cfg
> 
> Sep 20 11:28:51 notice  [SERV  ] Service engine loaded: corosync cluster
> closed process group service v1.01 [2]
> 
> Sep 20 11:28:51 info    [QB    ] server name: cpg
> 
> Sep 20 11:28:51 notice  [SERV  ] Service engine loaded: corosync profile
> loading service [4]
> 
> Sep 20 11:28:51 notice  [SERV  ] Service engine loaded: corosync
> resource monitoring service [6]
> 
> Sep 20 11:28:51 notice  [QUORUM] Using quorum provider corosync_votequorum
> 
> Sep 20 11:28:51 notice  [SERV  ] Service engine loaded: corosync vote
> quorum service v1.0 [5]
> 
> Sep 20 11:28:51 info    [QB    ] server name: votequorum
> 
> Sep 20 11:28:51 notice  [SERV  ] Service engine loaded: corosync cluster
> quorum service v0.1 [3]
> 
> Sep 20 11:28:51 info    [QB    ] server name: quorum
> 
> Ringbuffer:
> 
>  ->OVERWRITE
> 
>  ->write_pt [2776]
> 
>  ->read_pt [0]
> 
>  ->size [2097152 words]
> 
>  =>free [8377500 bytes]
> 
>  =>used [11104 bytes]
> 
> Segmentation fault
> 
> --->8---
> 
> And sometimes it starts.
> Then, when the engines start on 5 nodes, two of them show errors like:
> 
> ---8<---
> 
> ...
> mlx4: local QP operation err (QPN 32004d, WQE index 0, vendor syndrome
> 6b, opcode = 5e)
> mlx4: local QP operation err (QPN 3a004d, WQE index 0, vendor syndrome
> 6b, opcode = 5e)
> ...
> 
> --->8---
> 
> And the last one cannot join the others, even after several seconds:
> 
> ---8<---
> ...
> Sep 20 11:14:39 notice  [MAIN  ] Completed service synchronization,
> ready to provide service.
> Sep 20 11:14:39 notice  [QUORUM] Members[1]: 83929280
> Sep 20 11:14:39 notice  [TOTEM ] A processor joined or left the
> membership and a new membership (192.168.0.5:256) was formed.
> Sep 20 11:14:39 notice  [MAIN  ] Completed service synchronization,
> ready to provide service.
> Sep 20 11:14:42 notice  [QUORUM] Members[1]: 83929280
> Sep 20 11:14:42 notice  [TOTEM ] A processor joined or left the
> membership and a new membership (192.168.0.5:264) was formed.
> ...
> --->8---
> 
> The corosync.conf is:
> 
> ---8<---
> 
> totem {
> 
>         version: 2
> 
>         # How long before declaring a token lost (ms)
> 
>         token: 3000
> 
>         # How many token retransmits before forming a new configuration
> 
>         token_retransmits_before_loss_const: 10
> 
>         # How long to wait for join messages in the membership protocol
> (ms)
> 
>         join: 60
> 
>         # How long to wait for consensus to be achieved before starting
> a new round of membership configuration (ms)
> 
>         consensus: 3600
> 
>         # Turn off the virtual synchrony filter
> 
>         vsftype: none
> 
>         # Number of messages that may be sent by one processor on
> receipt of the token
> 
>         max_messages: 20
> 
>         # Limit generated nodeids to 31-bits (positive signed integers)
> 
>         clear_node_high_bit: yes
> 
>         # Disable encryption
> 
>         secauth: off
> 
>         # How many threads to use for encryption/decryption
> 
>         threads: 0
> 
>         # Optionally assign a fixed node id (integer)
> 
>         # nodeid: 1234
> 
>         # This specifies the mode of redundant ring, which may be none,
> active, or passive.
> 
>         rrp_mode: passive
> 
>         interface {
> 
>                 # The following values need to be set based on your
> environment
> 
>                 ringnumber: 0
> 
>                 bindnetaddr: 192.168.0.0
> 
>                 mcastaddr: 192.168.0.255
> 
>                 mcastport: 5405
> 
>         }
> 
>         interface {
> 
>                 # The following values need to be set based on your
> environment
> 
>                 ringnumber: 1
> 
>                 bindnetaddr: 192.168.1.0
> 
>                 mcastaddr: 192.168.1.255
> 
>                 mcastport: 5405
> 
>         }
> 
>         netmtu: 2044
> 
>         transport: iba
> 
> }
> 
> amf {
> 
>         mode: disabled
> 
> }
> 
> service {
> 
>         # Load the Pacemaker Cluster Resource Manager
> 
>         ver:       0
> 
>         name:      pacemaker
> 
> }
> 
> aisexec {
> 
>         user:   root
> 
>         group:  root
> 
> }
> 
> logging {
> 
>         fileline: off
> 
>         to_stderr: yes
> 
>         to_logfile: no
> 
>         to_syslog: yes
> 
>         syslog_facility: daemon
> 
>         debug: off
> 
>         timestamp: on
> 
>         logger_subsys {
> 
>                 subsys: AMF
> 
>                 debug: off
> 
>                 tags: enter|leave|trace1|trace2|trace3|trace4|trace6
> 
>         }
> 
> }
> 
> quorum {
> 
>         # Enable and configure quorum subsystem (default: off)
> 
>         # see also corosync.conf.5 and votequorum.5
> 
>         provider: corosync_votequorum
> 
>         expected_votes: 3
> 
> }
> 
> --->8---
> 
> We run a self-compiled corosync on Debian squeeze hosts.
> 

Which kernel version?

Old kernel versions have severe bugs in their InfiniBand implementation.
That is not to say there are no InfiniBand bugs in the totem
implementation either.  Implementing RDMA-UD mode for totem, to make the
results more reliable, is on our roadmap, but unfortunately it never
seems to bubble to the top.

(At the moment we use RDMA connected mode, which is suboptimal.)
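For anyone hitting the same problem, a quick way to collect the kernel
and verbs-stack details being asked about is sketched below.  The
`ibv_devinfo` utility is an assumption here: it ships with libibverbs
(package ibverbs-utils on Debian) and may not be installed.

```shell
#!/bin/sh
# Kernel version -- old kernels have known InfiniBand bugs.
uname -r

# Corosync version, to match against the startup log above.
command -v corosync >/dev/null 2>&1 && corosync -v

# HCA firmware and driver details, if the libibverbs utilities
# (ibverbs-utils on Debian) happen to be installed.
command -v ibv_devinfo >/dev/null 2>&1 && ibv_devinfo
```

Including this output in the report makes it possible to tell whether the
mlx4 QP errors line up with a known kernel-side fix.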

> Thank you.
> _______________________________________________
> discuss mailing list
> discuss@xxxxxxxxxxxx
> http://lists.corosync.org/mailman/listinfo/discuss


