Patch "RDMA/mlx5: Use xa_lock_irq when access to SRQ table" has been added to the 5.7-stable tree

This is a note to let you know that I've just added the patch titled

    RDMA/mlx5: Use xa_lock_irq when access to SRQ table

to the 5.7-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     rdma-mlx5-use-xa_lock_irq-when-access-to-srq-table.patch
and it can be found in the queue-5.7 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit c2330696ae4be966ae060cbdfe6ee6cea0db8b6d
Author: Maor Gottlieb <maorg@xxxxxxxxxxxx>
Date:   Sun Jul 12 13:26:41 2020 +0300

    RDMA/mlx5: Use xa_lock_irq when access to SRQ table
    
    [ Upstream commit c3d6057e07a5d15be7c69ea545b3f91877808c96 ]
    
    SRQ table is accessed both from interrupt and process context,
    therefore we must use xa_lock_irq.
    
       inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
       kworker/u17:9/8573   takes:
       ffff8883e3503d30 (&xa->xa_lock#13){?...}-{2:2}, at: mlx5_cmd_get_srq+0x18/0x70 [mlx5_ib]
       {IN-HARDIRQ-W} state was registered at:
         lock_acquire+0xb9/0x3a0
         _raw_spin_lock+0x25/0x30
         srq_event_notifier+0x2b/0xc0 [mlx5_ib]
         notifier_call_chain+0x45/0x70
         __atomic_notifier_call_chain+0x69/0x100
         forward_event+0x36/0xc0 [mlx5_core]
         notifier_call_chain+0x45/0x70
         __atomic_notifier_call_chain+0x69/0x100
         mlx5_eq_async_int+0xc5/0x160 [mlx5_core]
         notifier_call_chain+0x45/0x70
         __atomic_notifier_call_chain+0x69/0x100
         mlx5_irq_int_handler+0x19/0x30 [mlx5_core]
         __handle_irq_event_percpu+0x43/0x2a0
         handle_irq_event_percpu+0x30/0x70
         handle_irq_event+0x34/0x60
         handle_edge_irq+0x7c/0x1b0
         do_IRQ+0x60/0x110
         ret_from_intr+0x0/0x2a
         default_idle+0x34/0x160
         do_idle+0x1ec/0x220
         cpu_startup_entry+0x19/0x20
         start_secondary+0x153/0x1a0
         secondary_startup_64+0xa4/0xb0
       irq event stamp: 20907
       hardirqs last  enabled at (20907):   _raw_spin_unlock_irq+0x24/0x30
       hardirqs last disabled at (20906):   _raw_spin_lock_irq+0xf/0x40
       softirqs last  enabled at (20746):   __do_softirq+0x2c9/0x436
       softirqs last disabled at (20681):   irq_exit+0xb3/0xc0
    
       other info that might help us debug this:
        Possible unsafe locking scenario:
    
              CPU0
              ----
         lock(&xa->xa_lock#13);
         <Interrupt>
           lock(&xa->xa_lock#13);
    
        *** DEADLOCK ***
    
       2 locks held by kworker/u17:9/8573:
        #0: ffff888295218d38 ((wq_completion)mlx5_ib_page_fault){+.+.}-{0:0}, at: process_one_work+0x1f1/0x5f0
        #1: ffff888401647e78 ((work_completion)(&pfault->work)){+.+.}-{0:0}, at: process_one_work+0x1f1/0x5f0
    
       stack backtrace:
       CPU: 0 PID: 8573 Comm: kworker/u17:9 Tainted: G O      5.7.0_for_upstream_min_debug_2020_06_14_11_31_46_41 #1
       Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
       Workqueue: mlx5_ib_page_fault mlx5_ib_eqe_pf_action [mlx5_ib]
       Call Trace:
        dump_stack+0x71/0x9b
        mark_lock+0x4f2/0x590
        ? print_shortest_lock_dependencies+0x200/0x200
        __lock_acquire+0xa00/0x1eb0
        lock_acquire+0xb9/0x3a0
        ? mlx5_cmd_get_srq+0x18/0x70 [mlx5_ib]
        _raw_spin_lock+0x25/0x30
        ? mlx5_cmd_get_srq+0x18/0x70 [mlx5_ib]
        mlx5_cmd_get_srq+0x18/0x70 [mlx5_ib]
        mlx5_ib_eqe_pf_action+0x257/0xa30 [mlx5_ib]
        ? process_one_work+0x209/0x5f0
        process_one_work+0x27b/0x5f0
        ? __schedule+0x280/0x7e0
        worker_thread+0x2d/0x3c0
        ? process_one_work+0x5f0/0x5f0
        kthread+0x111/0x130
        ? kthread_park+0x90/0x90
        ret_from_fork+0x24/0x30
    
    Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
    Link: https://lore.kernel.org/r/20200712102641.15210-1-leon@xxxxxxxxxx
    Signed-off-by: Maor Gottlieb <maorg@xxxxxxxxxxxx>
    Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
    Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/drivers/infiniband/hw/mlx5/srq_cmd.c b/drivers/infiniband/hw/mlx5/srq_cmd.c
index 8fc3630a9d4c3..0224231a2e6f8 100644
--- a/drivers/infiniband/hw/mlx5/srq_cmd.c
+++ b/drivers/infiniband/hw/mlx5/srq_cmd.c
@@ -83,11 +83,11 @@ struct mlx5_core_srq *mlx5_cmd_get_srq(struct mlx5_ib_dev *dev, u32 srqn)
 	struct mlx5_srq_table *table = &dev->srq_table;
 	struct mlx5_core_srq *srq;
 
-	xa_lock(&table->array);
+	xa_lock_irq(&table->array);
 	srq = xa_load(&table->array, srqn);
 	if (srq)
 		refcount_inc(&srq->common.refcount);
-	xa_unlock(&table->array);
+	xa_unlock_irq(&table->array);
 
 	return srq;
 }


