Re: [PATCH] corosync to start in infiniband + redundant ring active/passive mode

Evgeny Barskiy wrote:
> Sorry for the incomplete previous message
> 
> On 21.11.2012 12:44, Evgeny Barskiy wrote:
>> On 20.11.2012 18:22, Jan Friesse wrote:
>>> Evgeny Barskiy wrote:
>>>> Corosync now works with infiniband transport in any redundant ring mode
>>>>
>>>> Signed-off-by: Evgeny Barskiy<barskiy@xxxxxx>
>>>> ---
>>>>   exec/totemiba.c |   10 +++++++++-
>>>>   1 files changed, 9 insertions(+), 1 deletions(-)
>>>>
>>>> diff --git a/exec/totemiba.c b/exec/totemiba.c
>>>> index 189eb00..5d47d6b 100644
>>>> --- a/exec/totemiba.c
>>>> +++ b/exec/totemiba.c
>>>> @@ -536,6 +536,7 @@ static int mcast_rdma_event_fn (int events,  int
>>>> suck,  void *context)
>>>>        */
>>>>       case RDMA_CM_EVENT_ADDR_RESOLVED:
>>>>           rdma_join_multicast (instance->mcast_cma_id,
>>>> &instance->mcast_addr, instance);
>>>> +        usleep(1000);
>>> what is this usleep good for?
> This one helps the rings to be initialized in the correct order. If we
> receive the RDMA_CM_EVENT_MULTICAST_JOIN event for the second ring before
> the same event for the first one, we fail on this assert:
> corosync: totemsrp.c:3236: memb_ring_id_create_or_load: Assertion
> `!totemip_zero_check(&memb_ring_id->rep)' failed.
> main_iface_change_fn must be called for the first ring first, because
> memb_ring_id_create_or_load uses the ring_id, which is filled in only
> when main_iface_change_fn initializes the first ring.
>>

Thanks for the explanation, this looks reasonable (even though I don't like
the solution).

I'm ACKing this patch (already committed), but do you think it would be
possible to instead wait for the first interface to be ready and only then
init the second one?

Because, for example, what if the first one fails to initialize? Then the
usleep doesn't help.
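
To make the idea concrete, here is a minimal sketch of event-driven
sequencing, not actual totemiba.c code. Everything in it (struct ring_set,
ring_start_join, ring_on_join_result) is a hypothetical name made up for
illustration, and ring_start_join only stands in for the place where
rdma_join_multicast would be called. The point is that ring N+1 is started
from the join-result handler of ring N, so the order no longer depends on
timing, and a failed first ring never lets the second one start.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_COUNT 2

/* Hypothetical per-ring state; NOT the real totemiba_instance. */
enum ring_state { RING_IDLE, RING_JOINING, RING_READY, RING_FAILED };

struct ring_set {
	enum ring_state state[RING_COUNT];
	int join_requests;	/* number of joins issued so far */
};

/* Stand-in for the spot that would call rdma_join_multicast(). */
static void ring_start_join (struct ring_set *rs, size_t ring)
{
	assert (ring < RING_COUNT);
	rs->state[ring] = RING_JOINING;
	rs->join_requests++;
}

/* Start only ring 0; every later ring waits for its predecessor. */
static void ring_set_init (struct ring_set *rs)
{
	size_t i;

	for (i = 0; i < RING_COUNT; i++) {
		rs->state[i] = RING_IDLE;
	}
	rs->join_requests = 0;
	ring_start_join (rs, 0);
}

/*
 * Assumed to be called when the multicast join for `ring` completes
 * (ok == true) or fails (ok == false).
 * Returns true if this call started the next ring.
 */
static bool ring_on_join_result (struct ring_set *rs, size_t ring, bool ok)
{
	if (ring >= RING_COUNT || rs->state[ring] != RING_JOINING) {
		return (false);	/* event for a ring that is not joining */
	}
	rs->state[ring] = ok ? RING_READY : RING_FAILED;
	if (!ok) {
		return (false);	/* do not start ring N+1 after a failure */
	}
	if (ring + 1 < RING_COUNT) {
		ring_start_join (rs, ring + 1);
		return (true);
	}
	return (false);
}
```

With something like this, a join result for ring 1 that arrives before
ring 0 is ready is simply ignored, because ring 1 has not been started yet.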

Thanks,
  Honza

>>>>           break;
>>>>       /*
>>>>        * occurs when the CM joins the multicast group
>>>> @@ -1029,6 +1030,12 @@ static int send_token_unbind (struct
>>>> totemiba_instance *instance)
>>>>           instance->totemiba_poll_handle,
>>>>           instance->send_token_channel->fd);
>>>>   +    if(instance->send_token_ah)
>>>> +    {
>>>> +        ibv_destroy_ah(instance->send_token_ah);
>>>> +        instance->send_token_ah = 0;
>>>> +    }
>>>> +
>>>>       rdma_destroy_qp (instance->send_token_cma_id);
>>>>       ibv_destroy_cq (instance->send_token_send_cq);
>>>>       ibv_destroy_cq (instance->send_token_recv_cq);
>>>> @@ -1417,7 +1424,8 @@ int totemiba_token_send (
>>>>       sge.lkey = send_buf->mr->lkey;
>>>>       sge.addr = (uintptr_t)msg;
>>>>   -    res = ibv_post_send (instance->send_token_cma_id->qp,
>>>> &send_wr, &failed_send_wr);
>>>> +    if(instance->send_token_ah != 0 && instance->send_token_bound)
>>>> +        res = ibv_post_send (instance->send_token_cma_id->qp,
>>>> &send_wr, &failed_send_wr);
>>>>         return (res);
>>>>   }
>>>>
>>
>>
> 


