Hello,

Here are some changes to allow corosync to run in iba + rrp mode (problem #2, described in http://lists.corosync.org/pipermail/discuss/2012-October/002086.html):

------ totemiba.c line 1031 send_token_unbind
+if(instance->send_token_ah)
+{
+  ibv_destroy_ah(instance->send_token_ah);
+  instance->send_token_ah = 0;
+}

------ totemiba.c line 1419 totemiba_token_send
+if(instance->send_token_ah)
 res = ibv_post_send (instance->send_token_cma_id->qp, &send_wr, &failed_send_wr);

It looks like it initializes, joins cpg and runs normally after this small fix.

The other problem (problem #1, described in http://lists.corosync.org/pipermail/discuss/2012-October/002086.html) is more interesting. Yes, it only occurs during program start, but that is just a side effect.

What if one switch is down during corosync start?
If it is the first switch: assert in memb_ring_id_create_or_load.
If it is any other one: infinite loop or even a segfault.

In that case the expected event will never arrive, so main_iface_change_fn will not be called enough times and we will never enter the gathering state.

In totemudp there is a check for whether the interface is down, and even if it is down we still call the main_iface_change function. So I think somewhere here we should at least check the "interface_up" variable, like in the udp version. We should probably also set up a timer which will retry the initialization later.

The next question is whether it is possible simply to lose the RDMA_CM_EVENT_MULTICAST_JOIN event. If yes, we will have an infinite loop this way; probably some timer is required?

Evgeny
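
For illustration, the two ideas above (skipping the token send when no address handle exists, and retrying a join that never completes) can be modeled in a small standalone C sketch. It does not use libibverbs or corosync code: struct iface_state, token_send_guarded, join_timer_expired and the retry limit are all hypothetical names chosen for this example, and a real fix would have to live in totemiba.c and use the corosync timer facilities.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-interface state. This is NOT the real
 * struct totemiba_instance; it only models the fields discussed above. */
struct iface_state {
    bool ah_valid;        /* models instance->send_token_ah being non-NULL */
    bool joined;          /* set once the multicast join event has arrived */
    unsigned retries;     /* join retries performed so far */
    unsigned max_retries; /* assumed limit before reporting the interface down */
};

enum join_result { JOIN_OK, JOIN_RETRY, JOIN_GIVE_UP };

/* Models the guard from the patch: do not post a send when there is no
 * address handle. Returns 0 if the send was "posted", -1 if skipped. */
static int token_send_guarded(const struct iface_state *s, unsigned *posted)
{
    assert(s != NULL && posted != NULL);
    if (!s->ah_valid)
        return -1;          /* nothing to send to yet; skip quietly */
    (*posted)++;            /* stands in for ibv_post_send() */
    return 0;
}

/* Models a timer callback that fires while the join is still pending.
 * Instead of waiting forever for an event that may never come, it asks
 * the caller to retry a bounded number of times and then gives up. */
static enum join_result join_timer_expired(struct iface_state *s)
{
    assert(s != NULL);
    if (s->joined)
        return JOIN_OK;       /* event arrived; no retry needed */
    if (s->retries >= s->max_retries)
        return JOIN_GIVE_UP;  /* caller would report the interface as down */
    s->retries++;
    return JOIN_RETRY;        /* caller would re-issue the join and re-arm */
}
```

On JOIN_GIVE_UP the caller could report the interface as down through the interface-change callback, much as the udp version does, so that the upper layer is not left waiting.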