On Thu, May 23, 2024 at 05:03:12PM +0200, Zhu Yanjun wrote:
> Subject: Re: [PATCH] RDMA/rxe: Fix responder length checking for UD request
>  packets
> From: Zhu Yanjun <yanjun.zhu@xxxxxxxxx>
> Date: Thu, 23 May 2024 17:03:12 +0200
>
>
> On 23.05.24 14:06, Zhu Yanjun wrote:
> >
> > On 23.05.24 11:46, Honggang LI wrote:
> > > According to the IBA specification:
> > > If a UD request packet is detected with an invalid length, the request
> > > shall be an invalid request and it shall be silently dropped by
> > > the responder. The responder then waits for a new request packet.
> > >
> > > commit 689c5421bfe0 ("RDMA/rxe: Fix incorrect responder length
> > > checking")
> > > defers responder length check for UD QPs in function `copy_data`.
> > > But it introduces a regression issue for UD QPs.
> > >
> > > When the packet size is too large to fit in the receive buffer.
> > > `copy_data` will return error code -EINVAL. Then `send_data_in`
> > > will return RESPST_ERR_MALFORMED_WQE. UD QP will transfer into
> > > ERROR state.
> > >
> > > Fixes: 689c5421bfe0 ("RDMA/rxe: Fix incorrect responder length
> > > checking")
> > > Signed-off-by: Honggang LI <honggangli@xxxxxxx>
> > > ---
> > >  drivers/infiniband/sw/rxe/rxe_resp.c | 12 ++++++++++++
> > >  1 file changed, 12 insertions(+)
> > >
> > > diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c
> > > b/drivers/infiniband/sw/rxe/rxe_resp.c
> > > index 963382f625d7..a74f29dcfdc9 100644
> > > --- a/drivers/infiniband/sw/rxe/rxe_resp.c
> > > +++ b/drivers/infiniband/sw/rxe/rxe_resp.c
> > > @@ -354,6 +354,18 @@ static enum resp_states
> > > rxe_resp_check_length(struct rxe_qp *qp,
> > >        * receive buffer later. For rmda operations additional
> > >        * length checks are performed in check_rkey.
> > >        */
> > > +    if ((qp_type(qp) == IB_QPT_GSI) || (qp_type(qp) == IB_QPT_UD)) {
> >
> > From IBA specification:
> >
> > "
> >
> > QP1, used for the General Services Interface (GSI).
> > •This QP uses the Unreliable Datagram transport service.
> > •All traffic to and from this QP uses any VL other than VL15.
> > •GSI packets arriving before the current packet’s command completes may
> > be dropped (i.e. the minimum queue depth of QP1 is one).
> >
> > "
> >
> > GSI should be MAD packets. And it should have a fixed format. Not sure
> > if the payload of GSI packets will exceed the size of the recv buffer.

It is dangerous to assume that remote GSI request packets always fit in
the local receive buffer. A carefully crafted hostile GSI request packet
can drive the remote QP1 into the ERROR state. Because connection
management traffic goes through QP1, the remote node can then no longer
establish new RC QP connections.

Thanks
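To make the concern concrete, here is a minimal user-space sketch in C of
the policy argued for above: an oversized request on a datagram QP (UD or
GSI) is silently dropped and the QP stays usable, instead of being pushed
into the ERROR state. Every name in it (model_qp, model_check_length, the
MODEL_* enums) is hypothetical and chosen for illustration only. It is not
rxe driver code and it is not the body of the patch, which is cut off in
the quote above. Treating an oversized request on a connected QP as fatal
is likewise an assumption of this model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the rxe driver's types. */
enum model_qp_type { MODEL_QPT_RC, MODEL_QPT_UD, MODEL_QPT_GSI };
enum model_verdict { MODEL_ACCEPT, MODEL_DROP, MODEL_QP_ERROR };

struct model_qp {
	enum model_qp_type type;
	size_t recv_buf_len;	/* bytes available in the posted receive buffer */
	int in_error;		/* nonzero once the QP has entered the error state */
	unsigned long dropped;	/* count of silently dropped requests */
};

/* UD and GSI (QP1) both use the unreliable datagram transport service. */
static int model_is_datagram(const struct model_qp *qp)
{
	return qp->type == MODEL_QPT_UD || qp->type == MODEL_QPT_GSI;
}

/*
 * Decide what to do with a request carrying payload_len bytes.
 * Datagram QPs drop an oversized request and keep running, so a single
 * hostile packet cannot take the QP down. In this model (an assumption),
 * a connected QP treats the same condition as fatal.
 */
enum model_verdict model_check_length(struct model_qp *qp, size_t payload_len)
{
	assert(qp != NULL);

	if (payload_len <= qp->recv_buf_len)
		return MODEL_ACCEPT;

	if (model_is_datagram(qp)) {
		qp->dropped++;
		return MODEL_DROP;
	}

	qp->in_error = 1;
	return MODEL_QP_ERROR;
}
```

With a 256-byte receive buffer, a 4096-byte request on a GSI QP yields
MODEL_DROP and leaves in_error at zero, so a following 256-byte request is
still accepted.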