Re: [PATCH jgg-for-next] RDMA/rxe: Fix spinlock recursion deadlock on requester

On Tue, Apr 18, 2023 at 5:07 PM Daisuke Matsuda
<matsuda-daisuke@xxxxxxxxxxx> wrote:
>
> After applying commit f605f26ea196, the following deadlock is observed:
>  Call Trace:
>   <IRQ>
>   _raw_spin_lock_bh+0x29/0x30
>   check_type_state.constprop.0+0x4e/0xc0 [rdma_rxe]
>   rxe_rcv+0x173/0x3d0 [rdma_rxe]
>   rxe_udp_encap_recv+0x69/0xd0 [rdma_rxe]
>   ? __pfx_rxe_udp_encap_recv+0x10/0x10 [rdma_rxe]
>   udp_queue_rcv_one_skb+0x258/0x520
>   udp_unicast_rcv_skb+0x75/0x90
>   __udp4_lib_rcv+0x364/0x5c0
>   ip_protocol_deliver_rcu+0xa7/0x160
>   ip_local_deliver_finish+0x73/0xa0
>   ip_sublist_rcv_finish+0x80/0x90
>   ip_sublist_rcv+0x191/0x220
>   ip_list_rcv+0x132/0x160
>   __netif_receive_skb_list_core+0x297/0x2c0
>   netif_receive_skb_list_internal+0x1c5/0x300
>   napi_complete_done+0x6f/0x1b0
>   virtnet_poll+0x1f4/0x2d0 [virtio_net]
>   __napi_poll+0x2c/0x1b0
>   net_rx_action+0x293/0x350
>   ? __napi_schedule+0x79/0x90
>   __do_softirq+0xcb/0x2ab
>   __irq_exit_rcu+0xb9/0xf0
>   common_interrupt+0x80/0xa0
>   </IRQ>
>   <TASK>
>   asm_common_interrupt+0x22/0x40
>   RIP: 0010:_raw_spin_lock+0x17/0x30
>   rxe_requester+0xe4/0x8f0 [rdma_rxe]
>   ? xas_load+0x9/0xa0
>   ? xa_load+0x70/0xb0
>   do_task+0x64/0x1f0 [rdma_rxe]
>   rxe_post_send+0x54/0x110 [rdma_rxe]
>   ib_uverbs_post_send+0x5f8/0x680 [ib_uverbs]
>   ? netif_receive_skb_list_internal+0x1e3/0x300
>   ib_uverbs_write+0x3c8/0x500 [ib_uverbs]
>   vfs_write+0xc5/0x3b0
>   ksys_write+0xab/0xe0
>   ? syscall_trace_enter.constprop.0+0x126/0x1a0
>   do_syscall_64+0x3b/0x90
>   entry_SYSCALL_64_after_hwframe+0x72/0xdc
>   </TASK>
>
> The deadlock is easily reproducible with perftest. Fix it by disabling
> softirq when acquiring the lock in process context.

I am fine with this patch. Thanks.

Acked-by: Zhu Yanjun <zyjzyj2000@xxxxxxxxx>

Zhu Yanjun

>
> Fixes: f605f26ea196 ("RDMA/rxe: Protect QP state with qp->state_lock")
> Signed-off-by: Daisuke Matsuda <matsuda-daisuke@xxxxxxxxxxx>
> ---
>  drivers/infiniband/sw/rxe/rxe_req.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
> index 8e50d116d273..65134a9aefe7 100644
> --- a/drivers/infiniband/sw/rxe/rxe_req.c
> +++ b/drivers/infiniband/sw/rxe/rxe_req.c
> @@ -180,13 +180,13 @@ static struct rxe_send_wqe *req_next_wqe(struct rxe_qp *qp)
>         if (wqe == NULL)
>                 return NULL;
>
> -       spin_lock(&qp->state_lock);
> +       spin_lock_bh(&qp->state_lock);
>         if (unlikely((qp_state(qp) == IB_QPS_SQD) &&
>                      (wqe->state != wqe_state_processing))) {
> -               spin_unlock(&qp->state_lock);
> +               spin_unlock_bh(&qp->state_lock);
>                 return NULL;
>         }
> -       spin_unlock(&qp->state_lock);
> +       spin_unlock_bh(&qp->state_lock);
>
>         wqe->mask = wr_opcode_mask(wqe->wr.opcode, qp);
>         return wqe;
> --
> 2.39.1
>
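
For readers following the trace above: the requester takes qp->state_lock
with plain spin_lock() in process context, an interrupt arrives, and the
receive softirq that runs on interrupt exit tries to take the same lock on
the same CPU. A minimal userspace sketch of that ordering follows. It is a
toy model, not kernel code: struct model_cpu and every model_* function are
hypothetical names invented for illustration, and the boolean flags only
approximate what spin_lock(), spin_lock_bh() and softirq processing do.

```c
#include <stdbool.h>

/* Toy single-CPU model. None of these names are kernel APIs. */
struct model_cpu {
	bool lock_held;       /* stands in for qp->state_lock held on this CPU */
	bool bh_disabled;     /* stands in for softirqs disabled by spin_lock_bh() */
	bool softirq_pending; /* a receive softirq is waiting to run */
	bool deadlocked;      /* softirq wanted a lock its own CPU already holds */
};

/* Stands in for the receive path taking the lock in softirq context. */
static void model_softirq(struct model_cpu *cpu)
{
	cpu->softirq_pending = false;
	if (cpu->lock_held) {
		/* Real code would spin forever: the holder cannot resume
		 * until the softirq returns, and the softirq never returns. */
		cpu->deadlocked = true;
		return;
	}
	cpu->lock_held = true;  /* take the lock */
	cpu->lock_held = false; /* and release it */
}

/* Stands in for an interrupt: on exit, the pending softirq runs
 * immediately unless bottom halves are disabled. */
static void model_interrupt(struct model_cpu *cpu)
{
	cpu->softirq_pending = true;
	if (!cpu->bh_disabled)
		model_softirq(cpu);
}

static void model_lock(struct model_cpu *cpu, bool use_bh)
{
	if (use_bh)
		cpu->bh_disabled = true;
	cpu->lock_held = true;
}

static void model_unlock(struct model_cpu *cpu, bool use_bh)
{
	cpu->lock_held = false;
	if (use_bh) {
		cpu->bh_disabled = false;
		/* Re-enabling bottom halves runs any deferred softirq. */
		if (cpu->softirq_pending)
			model_softirq(cpu);
	}
}

/* Stands in for the requester's critical section with a packet arriving
 * while the lock is held. Returns true if the model deadlocks. */
bool model_requester(bool use_bh)
{
	struct model_cpu cpu = {0};

	model_lock(&cpu, use_bh);
	model_interrupt(&cpu);
	model_unlock(&cpu, use_bh);
	return cpu.deadlocked;
}
```

In this model, model_requester(false) reports the deadlock and
model_requester(true) does not, because the softirq is deferred until the
lock has been dropped. That is the ordering the patch relies on when it
switches req_next_wqe() to spin_lock_bh()/spin_unlock_bh().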



