Re: [PATCH] RDMA/rxe: Fix incomplete state save in rxe_requester

On 7/21/23 13:04, Bob Pearson wrote:
> If a send packet is dropped by the IP layer in rxe_requester(),
> the call to rxe_xmit_packet() can fail with err == -EAGAIN.
> To recover, the state of the wqe is restored to the state before
> the packet was sent so it can be resent. However, the routines
> that save and restore the state miss a significant part of the
> variable state in the wqe: the dma struct, which tracks progress
> through the sge table. Moreover, the state is not saved until after
> the packet is built, and building the packet modifies the dma struct.
> 
> Under heavy stress testing, with many QPs on a fast node sending
> large messages to a slow node, dropped packets are observed, and the
> resent packets are corrupted because the dma struct was not restored.
> This patch fixes this behavior and allows the test cases to succeed.
> 
> Fixes: 3050b9985024 ("IB/rxe: Fix race condition between requester and completer")
> Signed-off-by: Bob Pearson <rpearsonhpe@xxxxxxxxx>
> ---
>  drivers/infiniband/sw/rxe/rxe_req.c | 47 ++++++++++++++++-------------
>  1 file changed, 26 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
> index 76d85731a45f..10ff2715d9bb 100644
> --- a/drivers/infiniband/sw/rxe/rxe_req.c
> +++ b/drivers/infiniband/sw/rxe/rxe_req.c
> @@ -708,10 +708,11 @@ static void save_state(struct rxe_send_wqe *wqe,
>  		       struct rxe_send_wqe *rollback_wqe,
>  		       u32 *rollback_psn)
>  {
> -	rollback_wqe->state     = wqe->state;
> +	rollback_wqe->state = wqe->state;
>  	rollback_wqe->first_psn = wqe->first_psn;
> -	rollback_wqe->last_psn  = wqe->last_psn;
> -	*rollback_psn		= qp->req.psn;
> +	rollback_wqe->last_psn = wqe->last_psn;
> +	rollback_wqe->dma = wqe->dma;
> +	*rollback_psn = qp->req.psn;
>  }
>  
>  static void rollback_state(struct rxe_send_wqe *wqe,
> @@ -719,10 +720,11 @@ static void rollback_state(struct rxe_send_wqe *wqe,
>  			   struct rxe_send_wqe *rollback_wqe,
>  			   u32 rollback_psn)
>  {
> -	wqe->state     = rollback_wqe->state;
> +	wqe->state = rollback_wqe->state;
>  	wqe->first_psn = rollback_wqe->first_psn;
> -	wqe->last_psn  = rollback_wqe->last_psn;
> -	qp->req.psn    = rollback_psn;
> +	wqe->last_psn = rollback_wqe->last_psn;
> +	wqe->dma = rollback_wqe->dma;
> +	qp->req.psn = rollback_psn;
>  }
>  
>  static void update_state(struct rxe_qp *qp, struct rxe_pkt_info *pkt)
> @@ -922,6 +924,9 @@ int rxe_requester(struct rxe_qp *qp)
>  		payload = mtu;
>  	}
>  
> +	/* save wqe state before we build and send packet */
> +	save_state(wqe, qp, &rollback_wqe, &rollback_psn);
> +
>  	skb = rxe_init_req_packet(qp, wqe, opcode, payload, &pkt);
>  	if (unlikely(!skb)) {
>  		rxe_err_qp(qp, "failed to create packet for %s",
> @@ -929,29 +934,29 @@ int rxe_requester(struct rxe_qp *qp)
>  		goto err;
>  	}
>  
> -	/*
> -	 * To prevent a race on wqe access between requester and completer,
> -	 * wqe members state and psn need to be set before calling
> -	 * rxe_xmit_packet().
> -	 * Otherwise, completer might initiate an unjustified retry flow.
> -	 */
> -	save_state(wqe, qp, &rollback_wqe, &rollback_psn);
> +	/* update wqe state as though we had sent it */
>  	update_wqe_state(qp, wqe, &pkt);
>  	update_wqe_psn(qp, wqe, &pkt, payload);
>  
>  	err = rxe_xmit_packet(qp, &pkt, skb);
>  	if (err) {
> -		qp->need_req_skb = 1;
> +		if (err != -EAGAIN) {
> +			wqe->status = IB_WC_LOC_QP_OP_ERR;
> +			goto err;
> +		}
>  
> +		/* the packet was dropped so reset wqe to the state
> +		 * before we sent it so we can try to resend
> +		 */
>  		rollback_state(wqe, qp, &rollback_wqe, rollback_psn);
>  
> -		if (err == -EAGAIN) {
> -			rxe_sched_task(&qp->req.task);
> -			goto exit;
> -		}
> +		/* force a delay until the dropped packet is freed and
> +		 * the send queue is drained below the low water mark
> +		 */
> +		qp->need_req_skb = 1;
>  
> -		wqe->status = IB_WC_LOC_QP_OP_ERR;
> -		goto err;
> +		rxe_sched_task(&qp->req.task);
> +		goto exit;
>  	}
>  
>  	update_state(qp, &pkt);
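
For reference, the dma struct being copied by the new
"rollback_wqe->dma = wqe->dma" line is the per-wqe cursor into the
sge list. Roughly (a sketch from memory of
include/uapi/rdma/rdma_user_rxe.h; exact layout may differ):

	struct rxe_dma_info {
		__u32	length;		/* total transfer length */
		__u32	resid;		/* bytes still to transfer */
		__u32	cur_sge;	/* index of the sge in progress */
		__u32	num_sge;	/* number of entries in sge[] */
		__u32	sge_offset;	/* offset within the current sge */
		__u32	reserved;
		union {			/* flexible payload follows */
			__u8		inline_data[];
			struct rxe_sge	sge[];
		};
	};

Because the sge[]/inline_data members are flexible arrays, the plain
struct assignment in save_state() copies only the fixed-size cursor
fields above, which is exactly the state that building the packet
advances and that a resend needs restored.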

Never mind. Missing some prereq patches. 


