On Mon, Mar 05, 2018 at 09:49:28PM -0800, Selvin Xavier wrote:
> Hitting the following hard lockup due to a race condition in
> error CQE processing.
>
> [26146.879798] bnxt_en 0000:04:00.0: QPLIB: FP: CQ Processed Req
> [26146.886346] bnxt_en 0000:04:00.0: QPLIB: wr_id[1251] = 0x0 with status 0xa
> [26156.350935] NMI watchdog: Watchdog detected hard LOCKUP on cpu 4
> [26156.357470] Modules linked in: nfsd auth_rpcgss nfs_acl lockd grace
> [26156.447957] CPU: 4 PID: 3413 Comm: kworker/4:1H Kdump: loaded
> [26156.457994] Hardware name: Dell Inc. PowerEdge R430/0CN7X8,
> [26156.466390] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
> [26156.472639] Call Trace:
> [26156.475379]  <NMI>  [<ffffffff98d0d722>] dump_stack+0x19/0x1b
> [26156.481833]  [<ffffffff9873f775>] watchdog_overflow_callback+0x135/0x140
> [26156.489341]  [<ffffffff9877f237>] __perf_event_overflow+0x57/0x100
> [26156.496256]  [<ffffffff98787c24>] perf_event_overflow+0x14/0x20
> [26156.502887]  [<ffffffff9860a580>] intel_pmu_handle_irq+0x220/0x510
> [26156.509813]  [<ffffffff98d16031>] perf_event_nmi_handler+0x31/0x50
> [26156.516738]  [<ffffffff98d1790c>] nmi_handle.isra.0+0x8c/0x150
> [26156.523273]  [<ffffffff98d17be8>] do_nmi+0x218/0x460
> [26156.528834]  [<ffffffff98d16d79>] end_repeat_nmi+0x1e/0x7e
> [26156.534980]  [<ffffffff987089c0>] ? native_queued_spin_lock_slowpath+0x1d0/0x200
> [26156.543268]  [<ffffffff987089c0>] ? native_queued_spin_lock_slowpath+0x1d0/0x200
> [26156.551556]  [<ffffffff987089c0>] ? native_queued_spin_lock_slowpath+0x1d0/0x200
> [26156.559842]  <EOE>  [<ffffffff98d083e4>] queued_spin_lock_slowpath+0xb/0xf
> [26156.567555]  [<ffffffff98d15690>] _raw_spin_lock+0x20/0x30
> [26156.573696]  [<ffffffffc08381a1>] bnxt_qplib_lock_buddy_cq+0x31/0x40 [bnxt_re]
> [26156.581789]  [<ffffffffc083bbaa>] bnxt_qplib_poll_cq+0x43a/0xf10 [bnxt_re]
> [26156.589493]  [<ffffffffc083239b>] bnxt_re_poll_cq+0x9b/0x760 [bnxt_re]
>
> The issue happens when RQ poll_cq, SQ poll_cq, or an async error event tries
> to put the error QP in the flush list. Since the SQ and RQ of each error QP
> are added to two different flush lists, we need to protect them using the
> locks of the corresponding CQs. A difference in the lock acquisition order
> between SQ poll_cq and RQ poll_cq can cause a hard lockup.
>
> Revisit the locking strategy and remove the usage of qplib_cq.hwq.lock.
> Instead, introduce qplib_cq.flush_lock to handle addition/deletion of QPs
> in the flush list. Also, always acquire the flush_lock in a fixed order
> (SQ CQ lock first, then RQ CQ lock) to avoid any potential deadlock.
>
> Other than the poll_cq context, the movement of a QP to/from the flush list
> can happen in the modify_qp context or from an async error event from HW.
> Synchronize these operations using the bnxt_re verbs layer CQ locks.
> To achieve this, add a callback from the HW abstraction layer (qplib) to
> the bnxt_re ib_verbs layer for async error events. Also, remove the
> buddy CQ functions as they are no longer required.
>
> Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@xxxxxxxxxxxx>
> Signed-off-by: Somnath Kotur <somnath.kotur@xxxxxxxxxxxx>
> Signed-off-by: Devesh Sharma <devesh.sharma@xxxxxxxxxxxx>
> Signed-off-by: Selvin Xavier <selvin.xavier@xxxxxxxxxxxx>
>  drivers/infiniband/hw/bnxt_re/ib_verbs.c   |  11 ++-
>  drivers/infiniband/hw/bnxt_re/ib_verbs.h   |   3 +
>  drivers/infiniband/hw/bnxt_re/main.c       |   7 ++
>  drivers/infiniband/hw/bnxt_re/qplib_fp.c   | 109 +++++++----------------------
>  drivers/infiniband/hw/bnxt_re/qplib_fp.h   |  12 ++++
>  drivers/infiniband/hw/bnxt_re/qplib_rcfw.c |   3 +-
>  6 files changed, 55 insertions(+), 90 deletions(-)

Applied to for-next

Thanks
Jason
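
[Editor's note] To make the fixed lock-ordering idea from the patch description concrete, below is a minimal sketch, not the actual bnxt_re code. The type and function names (sketch_cq, sketch_qp, flush_cq_lock_pair, flush_cq_unlock_pair) are hypothetical; only the ordering rule, take the send-CQ flush lock before the receive-CQ flush lock in every path, reflects the approach described above.

/*
 * Illustrative sketch only: shows how a pair of per-CQ flush locks can be
 * acquired in one fixed order (SQ's CQ first, then RQ's CQ) so that the
 * poll_cq, modify_qp, and async-error paths cannot deadlock against each
 * other when moving a QP to/from the flush lists.
 */
#include <linux/spinlock.h>

struct sketch_cq {
	spinlock_t flush_lock;		/* protects this CQ's flush list */
};

struct sketch_qp {
	struct sketch_cq *scq;		/* CQ serving the send queue */
	struct sketch_cq *rcq;		/* CQ serving the receive queue */
};

static void flush_cq_lock_pair(struct sketch_qp *qp, unsigned long *flags)
{
	/* Always take the SQ's CQ lock first, then the RQ's CQ lock;
	 * skip the second acquisition when both queues share one CQ. */
	spin_lock_irqsave(&qp->scq->flush_lock, *flags);
	if (qp->rcq != qp->scq)
		spin_lock(&qp->rcq->flush_lock);
}

static void flush_cq_unlock_pair(struct sketch_qp *qp, unsigned long *flags)
{
	/* Release in the reverse order of acquisition. */
	if (qp->rcq != qp->scq)
		spin_unlock(&qp->rcq->flush_lock);
	spin_unlock_irqrestore(&qp->scq->flush_lock, *flags);
}

Because every caller funnels through the same pair of helpers, no two contexts can hold the two flush locks in opposite orders, which is the inversion that produced the hard lockup in the trace above.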