Re: [PATCH 2/2] ksmbd: smbd: handle RDMA CM time wait event

On Thu, Jun 16, 2022 at 3:53 AM, Tom Talpey <tom@xxxxxxxxxx> wrote:
>
>
> On 6/14/2022 10:14 PM, Hyunchul Lee wrote:
> > On Tue, Jun 14, 2022 at 8:56 PM, Tom Talpey <tom@xxxxxxxxxx> wrote:
> >>
> >>
> >> On 6/13/2022 7:01 PM, Hyunchul Lee wrote:
> >>> After a QP has been disconnected, it stays
> >>> in a timewait state for in flight packets.
> >>> After the state has completed,
> >>> RDMA_CM_EVENT_TIMEWAIT_EXIT is reported.
> >>> Disconnect on RDMA_CM_EVENT_TIMEWAIT_EXIT
> >>> so that ksmbd can restart.
> >>>
> >>> Signed-off-by: Hyunchul Lee <hyc.lee@xxxxxxxxx>
> >>> ---
> >>>    fs/ksmbd/transport_rdma.c | 1 +
> >>>    1 file changed, 1 insertion(+)
> >>>
> >>> diff --git a/fs/ksmbd/transport_rdma.c b/fs/ksmbd/transport_rdma.c
> >>> index d035e060c2f0..4b1a471afcd0 100644
> >>> --- a/fs/ksmbd/transport_rdma.c
> >>> +++ b/fs/ksmbd/transport_rdma.c
> >>> @@ -1535,6 +1535,7 @@ static int smb_direct_cm_handler(struct rdma_cm_id *cm_id,
> >>>                wake_up_interruptible(&t->wait_status);
> >>>                break;
> >>>        }
> >>> +     case RDMA_CM_EVENT_TIMEWAIT_EXIT:
> >>>        case RDMA_CM_EVENT_DEVICE_REMOVAL:
> >>>        case RDMA_CM_EVENT_DISCONNECTED: {
> >>>                t->status = SMB_DIRECT_CS_DISCONNECTED;
> >>
> >> Is this issue seen on all RDMA providers? Because I would normally
> >> expect that an RDMA_CM_EVENT_DISCONNECTED will precede the TIMEWAIT
> >> event. What scenarios have you seen this not occur?
> >>
> >
> > There was an issue where ksmbd got stuck after attempting to shut down.
> > We have not been able to reproduce it yet, but it seems to be
> > related to the TIMEWAIT event.
>
> I don't think it's appropriate to add this case to SMB. I think it's
> quite unlikely that it will address anything, because an RDMA provider
> must have indicated a CM_EVENT_DISCONNECTED prior to any TIMEWAIT.
> So, the QP (and connection) will already have been torn down by ksmbd
> at the earlier event. Perhaps ksmbd did not properly drain the QP at
> the initial disconnect.
>
> > And other drivers, such as nvme, disconnect on the TIMEWAIT event.
>
> NVME is a completely different upper layer, and has different client/
> server transport behavior. The SMB session insulates its peers from
> most transport errors, and should not be requesting timewait for
> its connections, and definitely not waiting for timewait to expire
> before initiating teardown (or recovery). The NFS/RDMA client and
> server ignore this event, btw.
>
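
The ordering argument above can be illustrated with a small userspace sketch. It is not ksmbd code: every name here (sim_cm_event, sim_transport, sim_cm_handler) is hypothetical, and the "already disconnected" guard is an assumption added for illustration. It only models the claim that teardown happens on the DISCONNECTED event, so a later TIMEWAIT_EXIT has nothing left to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the RDMA CM event codes discussed above. */
enum sim_cm_event {
	SIM_EVENT_ESTABLISHED,
	SIM_EVENT_DISCONNECTED,
	SIM_EVENT_DEVICE_REMOVAL,
	SIM_EVENT_TIMEWAIT_EXIT,
};

/* Hypothetical connection states, loosely modeled on SMB_DIRECT_CS_*. */
enum sim_status {
	SIM_CS_NEW,
	SIM_CS_CONNECTED,
	SIM_CS_DISCONNECTED,
};

struct sim_transport {
	enum sim_status status;
	int teardown_count;	/* number of times teardown was started */
};

/*
 * Simulated CM event handler. Returns true if this event started a
 * teardown. TIMEWAIT_EXIT is deliberately ignored: by the time it
 * arrives, DISCONNECTED has already moved the transport to the
 * disconnected state.
 */
static bool sim_cm_handler(struct sim_transport *t, enum sim_cm_event ev)
{
	assert(t != NULL);

	switch (ev) {
	case SIM_EVENT_ESTABLISHED:
		t->status = SIM_CS_CONNECTED;
		return false;
	case SIM_EVENT_DEVICE_REMOVAL:
	case SIM_EVENT_DISCONNECTED:
		/* Assumed guard: do not tear down twice. */
		if (t->status == SIM_CS_DISCONNECTED)
			return false;
		t->status = SIM_CS_DISCONNECTED;
		t->teardown_count++;
		return true;
	case SIM_EVENT_TIMEWAIT_EXIT:
		/* Nothing to do: the QP is not reused. */
		return false;
	}
	return false;
}
```

Feeding the handler ESTABLISHED, DISCONNECTED, and then TIMEWAIT_EXIT leaves the simulated transport disconnected with exactly one teardown, which is the behavior the review expects from a provider that reports DISCONNECTED first.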

Okay, I got it.
I am looking for the cause and have found some clues.

> >> Unless ksmbd wishes to reuse its QP's, which is not currently the
> >> case (right?), there's pretty much no reason to manage QP state and
> >> hang around for TIMEWAIT.
> >
> > Right, ksmbd doesn't reuse QP.
>
> Then there appears to be no good justification for the change. Sorry,
> but it's a NAK from me.
>

Thank you very much for the detailed explanation.

> Tom.



-- 
Thanks,
Hyunchul



