On 21.11.2012 13:29, Jan Friesse wrote:
Evgeny Barskiy wrote:
Sorry for the incomplete previous message
On 21.11.2012 12:44, Evgeny Barskiy wrote:
On 20.11.2012 18:22, Jan Friesse wrote:
Evgeny Barskiy wrote:
Corosync now works with infiniband transport in any redundant ring mode
Signed-off-by: Evgeny Barskiy<barskiy@xxxxxx>
---
exec/totemiba.c | 10 +++++++++-
1 files changed, 9 insertions(+), 1 deletions(-)
diff --git a/exec/totemiba.c b/exec/totemiba.c
index 189eb00..5d47d6b 100644
--- a/exec/totemiba.c
+++ b/exec/totemiba.c
@@ -536,6 +536,7 @@ static int mcast_rdma_event_fn (int events, int suck, void *context)
*/
case RDMA_CM_EVENT_ADDR_RESOLVED:
rdma_join_multicast (instance->mcast_cma_id, &instance->mcast_addr, instance);
+ usleep(1000);
What is this usleep good for?
It helps the rings initialize in the correct order. If we receive the
RDMA_CM_EVENT_MULTICAST_JOIN event for the second ring before the same
event for the first one, we fail on this assert:
corosync: totemsrp.c:3236: memb_ring_id_create_or_load: Assertion
`!totemip_zero_check(&memb_ring_id->rep)' failed.
main_iface_change_fn has to be called for the first ring first, because
memb_ring_id_create_or_load uses a ring_id that is filled in only when
main_iface_change_fn initializes the first ring.
Thanks for the explanation. This looks reasonable (even though I don't
like the solution).
I'm ACKing this patch (already committed), but do you think it would be
possible to wait for the first interface to be ready and only then init
the second one?
Because, for example, what if the first one fails to initialize? Then
the usleep doesn't help.
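The ordering asked for here can be pictured without a sleep: hold back
the notification for any later ring until every earlier ring has
reported. This is only a minimal sketch in plain C with no RDMA calls.
ring_gate, gate_init, gate_on_join, record_ring and MAX_RINGS are
hypothetical names, not corosync code, and the callback merely stands in
for main_iface_change_fn.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_RINGS 2  /* hypothetical: two redundant rings */

/* Stand-in for the per-ring "interface changed" callback. */
typedef void (*iface_change_fn)(int ring_no, void *ctx);

struct ring_gate {
    bool joined[MAX_RINGS];     /* join event seen for this ring */
    bool delivered[MAX_RINGS];  /* callback already invoked for this ring */
    iface_change_fn deliver;
    void *ctx;
};

static void gate_init(struct ring_gate *g, iface_change_fn fn, void *ctx)
{
    for (size_t i = 0; i < MAX_RINGS; i++) {
        g->joined[i] = false;
        g->delivered[i] = false;
    }
    g->deliver = fn;
    g->ctx = ctx;
}

/* Record a join for ring_no, then deliver callbacks strictly in ring order. */
static void gate_on_join(struct ring_gate *g, int ring_no)
{
    if (ring_no < 0 || ring_no >= MAX_RINGS)
        return;
    g->joined[ring_no] = true;

    for (int i = 0; i < MAX_RINGS; i++) {
        if (!g->joined[i])
            break;  /* an earlier ring is still pending, keep waiting */
        if (!g->delivered[i]) {
            g->delivered[i] = true;
            g->deliver(i, g->ctx);
        }
    }
}

/* Example callback: records the order in which rings were delivered. */
struct order_log {
    int order[MAX_RINGS];
    int count;
};

static void record_ring(int ring_no, void *ctx)
{
    struct order_log *log = ctx;
    if (log->count < MAX_RINGS)
        log->order[log->count++] = ring_no;
}
```

As written, the gate waits forever if ring 0 never joins, which is
exactly the failure case raised above; a real fix would also have to
treat "ring 0 failed" as a reported state.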
Exactly. Actually, I think we have two separate problems here:
1) Allowing rings to be initialized in arbitrary order, regardless of
transport type.
This probably means we can fix main_iface_change_fn in totemsrp.c by
moving most of its functionality inside the last if statement, just
before entering the gather state.
2) Solving various initialization problems with the infiniband transport.
Currently we fail in any of the following cases (in both rrp and
non-rrp mode):
1. the interface isn't up (check it in timer_function_netif_check_timeout)
2. the route wasn't resolved (do something with the
RDMA_CM_EVENT_ROUTE_ERROR event in mcast_rdma_event_fn)
3. joining the multicast group fails (do something with the
RDMA_CM_EVENT_MULTICAST_ERROR event in mcast_rdma_event_fn)
In any of these cases we never call main_iface_change_fn for this ring
(whereas the UDP version just marks the ring as failed and calls it
anyway), so we never enter the gather state.
---
Also, I think there is probably an even more serious problem (unrelated
to the above): what if our infiniband subnet manager goes down for any
reason?
Yes, in that case another SM on another blade will wake up. As I
understand from the Mellanox programming manual, we will receive an
IBV_EVENT_SM_CHANGE event and will have to re-register the multicast
group, etc.
Thanks,
Honza
break;
/*
* occurs when the CM joins the multicast group
@@ -1029,6 +1030,12 @@ static int send_token_unbind (struct totemiba_instance *instance)
instance->totemiba_poll_handle,
instance->send_token_channel->fd);
+ if(instance->send_token_ah)
+ {
+ ibv_destroy_ah(instance->send_token_ah);
+ instance->send_token_ah = 0;
+ }
+
rdma_destroy_qp (instance->send_token_cma_id);
ibv_destroy_cq (instance->send_token_send_cq);
ibv_destroy_cq (instance->send_token_recv_cq);
@@ -1417,7 +1424,8 @@ int totemiba_token_send (
sge.lkey = send_buf->mr->lkey;
sge.addr = (uintptr_t)msg;
- res = ibv_post_send (instance->send_token_cma_id->qp, &send_wr, &failed_send_wr);
+ if(instance->send_token_ah != 0 && instance->send_token_bound)
+ res = ibv_post_send (instance->send_token_cma_id->qp, &send_wr, &failed_send_wr);
return (res);
}
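
The last two hunks follow a common pattern: release a handle once and
clear the pointer, then refuse to send while the handle is missing or
the channel is unbound. A self-contained sketch of that pattern, with no
libibverbs calls: fake_ah stands in for the address handle, and
token_sender, sender_unbind and sender_post are hypothetical names.
Returning -1 for a skipped send is a choice made for this sketch only;
the patch itself leaves res unchanged in that case.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in for an address handle; the patch operates on send_token_ah. */
struct fake_ah {
    int unused;
};

struct token_sender {
    struct fake_ah *ah;  /* NULL when no handle exists */
    bool bound;          /* mirrors send_token_bound in the patch */
    int posted;          /* number of sends actually posted */
};

/* Release the handle once; calling this again is a harmless no-op. */
static void sender_unbind(struct token_sender *s)
{
    if (s->ah != NULL) {
        free(s->ah);
        s->ah = NULL;
    }
}

/*
 * Post only while the handle exists and the channel is bound.
 * Returns 0 when a send was posted, -1 when it was skipped.
 */
static int sender_post(struct token_sender *s)
{
    if (s->ah == NULL || !s->bound)
        return -1;
    s->posted++;
    return 0;
}
```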
_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss