RE: [PATCH for-next] RDMA/siw: Fix duplicated reported IW_CM_EVENT_CONNECT_REPLY event

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> -----Original Message-----
> From: Cheng Xu <chengyou@xxxxxxxxxxxxxxxxx>
> Sent: Thursday, 14 July 2022 15:20
> To: Bernard Metzler <BMT@xxxxxxxxxxxxxx>; jgg@xxxxxxxx; leon@xxxxxxxxxx
> Cc: linux-rdma@xxxxxxxxxxxxxxx
> Subject: [EXTERNAL] Re: [PATCH for-next] RDMA/siw: Fix duplicated reported
> IW_CM_EVENT_CONNECT_REPLY event
> 
> 
> 
> On 7/14/22 8:59 PM, Bernard Metzler wrote:
> >> -----Original Message-----
> >> From: Cheng Xu <chengyou@xxxxxxxxxxxxxxxxx>
> >> Sent: Thursday, 14 July 2022 03:31
> >> To: jgg@xxxxxxxx; leon@xxxxxxxxxx; Bernard Metzler <BMT@xxxxxxxxxxxxxx>
> >> Cc: linux-rdma@xxxxxxxxxxxxxxx; chengyou@xxxxxxxxxxxxxxxxx
> >> Subject: [EXTERNAL] [PATCH for-next] RDMA/siw: Fix duplicated reported
> >> IW_CM_EVENT_CONNECT_REPLY event
> >>
> >> If siw_recv_mpa_rr returns -EAGAIN, it means that the MPA reply hasn't
> >> been received completely, and should not report
> IW_CM_EVENT_CONNECT_REPLY
> >> in this case. This may trigger a call trace in iw_cm. A simple way to
> >> trigger this:
> >
> > Great, thanks! I obviously did never hit an incomplete
> > MPA hdr. Please make another change to fix it correctly,
> > as suggested below.
> >
> >
> > case of an incomplete
> >>  server: ib_send_lat
> >>  client: ib_send_lat -R <server_ip>
> >>
> >> The call trace looks like this:
> >>
> >>  kernel BUG at drivers/infiniband/core/iwcm.c:894!
> >>  invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> >>  <...>
> >>  Workqueue: iw_cm_wq cm_work_handler [iw_cm]
> >>  Call Trace:
> >>   <TASK>
> >>   cm_work_handler+0x1dd/0x370 [iw_cm]
> >>   process_one_work+0x1e2/0x3b0
> >>   worker_thread+0x49/0x2e0
> >>   ? rescuer_thread+0x370/0x370
> >>   kthread+0xe5/0x110
> >>   ? kthread_complete_and_exit+0x20/0x20
> >>   ret_from_fork+0x1f/0x30
> >>   </TASK>
> >>
> >> Signed-off-by: Cheng Xu <chengyou@xxxxxxxxxxxxxxxxx>
> >> ---
> >>  drivers/infiniband/sw/siw/siw_cm.c | 7 ++++---
> >>  1 file changed, 4 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/infiniband/sw/siw/siw_cm.c
> >> b/drivers/infiniband/sw/siw/siw_cm.c
> >> index 17f34d584cd9..f88d2971c2c6 100644
> >> --- a/drivers/infiniband/sw/siw/siw_cm.c
> >> +++ b/drivers/infiniband/sw/siw/siw_cm.c
> >> @@ -725,11 +725,11 @@ static int siw_proc_mpareply(struct siw_cep *cep)
> >>  	enum mpa_v2_ctrl mpa_p2p_mode = MPA_V2_RDMA_NO_RTR;
> >>
> >>  	rv = siw_recv_mpa_rr(cep);
> >> -	if (rv != -EAGAIN)
> >> -		siw_cancel_mpatimer(cep);
> >>  	if (rv)
> >>  		goto out_err;
> >>
> >> +	siw_cancel_mpatimer(cep);
> >> +
> >
> > Cancel the MPA timer only if we have a
> > real error. -EAGAIN translates to just
> > further waiting. So best to add the timer
> > cancellation to the error bailout section.
> >
> >>  	rep = &cep->mpa.hdr;
> >>
> >>  	if (__mpa_rr_revision(rep->params.bits) > MPA_REVISION_2) {
> >> @@ -895,7 +895,8 @@ static int siw_proc_mpareply(struct siw_cep *cep)
> >>  	}
> >>
> >>  out_err:
> >> -	siw_cm_upcall(cep, IW_CM_EVENT_CONNECT_REPLY, -EINVAL);
> >> +	if (rv != -EAGAIN)
> > {
> > cancel MPA timer here.
> 
> Indeed we do not need it here, because when siw_proc_mpareply returns error
> but not -EAGAIN, the release_cep will be set in the caller
> (siw_cm_work_handler),
> and siw_cancel_mpatimer will be called in the error handle flow.
> 
> I think this is better, because the error handle is more unified.

Yes, sorry, your original patch is correct.

> 
> How do you think?
> 
> Thanks,
> Cheng Xu
> 
> 
> > 		siw_cancel_mpatimer(cep);
> >> +		siw_cm_upcall(cep, IW_CM_EVENT_CONNECT_REPLY, -EINVAL);
> > }
> >>
> >>  	return rv;
> >>  }
> >> --
> >> 2.37.0
> >

Thank you!


Acked-by: Bernard Metzler <bmt@xxxxxxxxxxxxxx>




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux