Re: [PATCH] RDMA/rxe: Fix responder length checking for UD request packets

On Tue, May 28, 2024 at 01:10:00PM +0200, Zhu Yanjun wrote:
> On 24.05.24 03:52, Honggang LI wrote:
> > On Thu, May 23, 2024 at 05:03:12PM +0200, Zhu Yanjun wrote:
> > > Subject: Re: [PATCH] RDMA/rxe: Fix responder length checking for UD request
> > >   packets
> > > From: Zhu Yanjun <yanjun.zhu@xxxxxxxxx>
> > > Date: Thu, 23 May 2024 17:03:12 +0200
> > > 
> > > 
> > > On 23.05.24 14:06, Zhu Yanjun wrote:
> > > > 
> > > > On 23.05.24 11:46, Honggang LI wrote:
> > > > > According to the IBA specification:
> > > > > If a UD request packet is detected with an invalid length, the request
> > > > > shall be an invalid request and it shall be silently dropped by
> > > > > the responder. The responder then waits for a new request packet.
> > > > > 
> > > > > commit 689c5421bfe0 ("RDMA/rxe: Fix incorrect responder length
> > > > > checking")
> > > > > defers responder length check for UD QPs in function `copy_data`.
> > > > > But it introduces a regression issue for UD QPs.
> > > > > 
> > > > > When the packet size is too large to fit in the receive buffer,
> > > > > `copy_data` returns error code -EINVAL. `send_data_in` then
> > > > > returns RESPST_ERR_MALFORMED_WQE, and the UD QP transitions into
> > > > > the ERROR state.
> > > > > 
> > > > > Fixes: 689c5421bfe0 ("RDMA/rxe: Fix incorrect responder length
> > > > > checking")
> > > > > Signed-off-by: Honggang LI <honggangli@xxxxxxx>
> > > > > ---
> > > > >    drivers/infiniband/sw/rxe/rxe_resp.c | 12 ++++++++++++
> > > > >    1 file changed, 12 insertions(+)
> > > > > 
> > > > > diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c
> > > > > b/drivers/infiniband/sw/rxe/rxe_resp.c
> > > > > index 963382f625d7..a74f29dcfdc9 100644
> > > > > --- a/drivers/infiniband/sw/rxe/rxe_resp.c
> > > > > +++ b/drivers/infiniband/sw/rxe/rxe_resp.c
> > > > > @@ -354,6 +354,18 @@ static enum resp_states
> > > > > rxe_resp_check_length(struct rxe_qp *qp,
> > > > >         * receive buffer later. For rmda operations additional
> > > > >         * length checks are performed in check_rkey.
> > > > >         */
> > > > > +    if ((qp_type(qp) == IB_QPT_GSI) || (qp_type(qp) == IB_QPT_UD)) {
> > > > 
> > > >  From IBA specification:
> > > > 
> > > > "
> > > > 
> > > > QP1, used for the General Services Interface (GSI).
> > > > •This QP uses the Unreliable Datagram transport service.
> > > > •All traffic to and from this QP uses any VL other than VL15.
> > > > •GSI packets arriving before the current packet’s command completes may
> > > > be dropped (i.e. the minimum queue depth of QP1 is one).
> > > > 
> > > > "
> > > > 
> > > > GSI packets should be MAD packets, which have a fixed format. I am
> > > > not sure whether the payload of a GSI packet can exceed the size of
> > > > the recv buffer.
> > 
> > It's dangerous to trust that remote GSI request packets always fit in
> > the local receive buffer. A well-crafted hostile GSI request packet can
> > put the remote QP1 into the ERROR state. That means the remote node
> > can't establish new RC QP connections.
> 
> Thanks, Honggang.
> Based on our discussion, this seems to be a security problem. It seems that
> this problem is related to MLX5. Until MLX5 engineers look into this
> problem, for RXE, this commit keeps RXE from hanging in the ERROR state.

Current RDMA networks are designed with the assumption that all
participants are trusted.

Thanks

> 
> LGTM.
> 
> Zhu Yanjun
> 
> > 
> > Thanks
> > 
> 
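The quoted hunk is cut off after the `qp_type(qp)` test, so the body of the check is not shown in this thread. As a rough illustration of the behaviour the commit message describes (drop an oversized UD or GSI request instead of letting `copy_data` fail later and push the QP into the ERROR state), a minimal user-space sketch might look like the following. All names here (`qp_kind`, `resp_verdict`, `recv_sge`, `check_ud_length`) are hypothetical stand-ins, not the rxe driver's own types, and the 40-byte GRH reservation is an assumption about how the UD receive buffer is laid out.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's QP types and responder states. */
enum qp_kind { QP_RC, QP_UD, QP_GSI };
enum resp_verdict { RESP_CONTINUE, RESP_DROP };

/* Assumption: a UD receive buffer reserves 40 bytes for the GRH. */
#define GRH_BYTES 40u

/* Hypothetical scatter/gather entry; only the length matters here. */
struct recv_sge {
	size_t length;
};

/* Sum the lengths of all scatter/gather entries of the receive WQE. */
static size_t recv_buffer_len(const struct recv_sge *sge, size_t num_sge)
{
	size_t total = 0;

	for (size_t i = 0; i < num_sge; i++)
		total += sge[i].length;
	return total;
}

/*
 * For UD and GSI QPs, reject a request whose payload (plus the GRH)
 * does not fit in the posted receive buffer, so that it can be dropped
 * before any copy is attempted. Other QP types are left to later checks.
 */
static enum resp_verdict check_ud_length(enum qp_kind kind, size_t payload,
					 const struct recv_sge *sge,
					 size_t num_sge)
{
	if (kind != QP_UD && kind != QP_GSI)
		return RESP_CONTINUE;

	if (payload + GRH_BYTES > recv_buffer_len(sge, num_sge))
		return RESP_DROP;

	return RESP_CONTINUE;
}
```

With two receive entries of 64 and 32 bytes, a 56-byte UD payload fits exactly under these assumptions, while a 57-byte payload would be dropped and the QP would keep waiting for the next request.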



