Re: [PATCH rdma-rc] RDMA/mlx5: Release locks during notifier unregister

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2019-07-31 at 11:38 +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> 
> The below kernel panic was observed when created bond mode LACP
> with GRE tunnel on top. The reason to it was not released spinlock
> during mlx5 notify unregsiter sequence.
> 
> [  234.562007] BUG: scheduling while atomic: sh/10900/0x00000002
> [  234.563005] Preemption disabled at:
> [  234.566864] ------------[ cut here ]------------
> [  234.567120] DEBUG_LOCKS_WARN_ON(val > preempt_count())
> [  234.567139] WARNING: CPU: 16 PID: 10900 at kernel/sched/core.c:3203
> preempt_count_sub+0xca/0x170
> [  234.569550] CPU: 16 PID: 10900 Comm: sh Tainted: G        W 5.2.0-
> rc1-for-linust-dbg-2019-05-25_04-57-33-60 #1
> [  234.569886] Hardware name: Dell Inc. PowerEdge R720/0X3D66, BIOS
> 2.6.1 02/12/2018
> [  234.570183] RIP: 0010:preempt_count_sub+0xca/0x170
> [  234.570404] Code: 03 38
> d0 7c 08 84 d2 0f 85 b0 00 00 00 8b 15 dd 02 03 04 85 d2 75 ba 48 c7
> c6
> 00 e1 88 83 48 c7 c7 40 e1 88 83 e8 76 11 f7 ff <0f> 0b 5b c3 65 8b 05
> d3 1f d8 7e 84 c0 75 82 e8 62 c3 c3 00 85 c0
> [  234.570911] RSP: 0018:ffff888b94477b08 EFLAGS: 00010286
> [  234.571133] RAX: 0000000000000000 RBX: 0000000000000001 RCX:
> 0000000000000000
> [  234.571391] RDX: 0000000000000000 RSI: 0000000000000004 RDI:
> 0000000000000246
> [  234.571648] RBP: ffff888ba5560000 R08: fffffbfff08962d5 R09:
> fffffbfff08962d5
> [  234.571902] R10: 0000000000000001 R11: fffffbfff08962d4 R12:
> ffff888bac6e9548
> [  234.572157] R13: ffff888babfaf728 R14: ffff888bac6e9568 R15:
> ffff888babfaf750
> [  234.572412] FS: 00007fcafa59b740(0000) GS:ffff888bed200000(0000)
> knlGS:0000000000000000
> [  234.572686] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  234.572914] CR2: 00007f984f16b140 CR3: 0000000b2bf0a001 CR4:
> 00000000001606e0
> [  234.573172] Call Trace:
> [  234.573336] _raw_spin_unlock+0x2e/0x50
> [  234.573542] mlx5_ib_unbind_slave_port+0x1bc/0x690 [mlx5_ib]
> [  234.573793] mlx5_ib_cleanup_multiport_master+0x1d3/0x660 [mlx5_ib]
> [  234.574039] mlx5_ib_stage_init_cleanup+0x4c/0x360 [mlx5_ib]
> [  234.574271]  ? kfree+0xf5/0x2f0
> [  234.574465] __mlx5_ib_remove+0x61/0xd0 [mlx5_ib]
> [  234.574688]  ? __mlx5_ib_remove+0xd0/0xd0 [mlx5_ib]
> [  234.574951] mlx5_remove_device+0x234/0x300 [mlx5_core]
> [  234.575224] mlx5_unregister_device+0x4d/0x1e0 [mlx5_core]
> [  234.575493] remove_one+0x4f/0x160 [mlx5_core]
> [  234.575704] pci_device_remove+0xef/0x2a0
> [  234.581407]  ? pcibios_free_irq+0x10/0x10
> [  234.587143]  ? up_read+0xc1/0x260
> [  234.592785] device_release_driver_internal+0x1ab/0x430
> [  234.598442] unbind_store+0x152/0x200
> [  234.604064]  ? sysfs_kf_write+0x3b/0x180
> [  234.609441]  ? sysfs_file_ops+0x160/0x160
> [  234.615021] kernfs_fop_write+0x277/0x440
> [  234.620288]  ? __sb_start_write+0x1ef/0x2c0
> [  234.625512] vfs_write+0x15e/0x460
> [  234.630786] ksys_write+0x156/0x1e0
> [  234.635988]  ? __ia32_sys_read+0xb0/0xb0
> [  234.641120]  ? trace_hardirqs_off_thunk+0x1a/0x1c
> [  234.646163] do_syscall_64+0x95/0x470
> [  234.651106] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [  234.656004] RIP: 0033:0x7fcaf9c9cfd0
> [  234.660686] Code: 73 01
> c3 48 8b 0d c0 6e 2d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00
> 00
> 83 3d cd cf 2d 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73
> 31 c3 48 83 ec 08 e8 ee cb 01 00 48 89 04 24
> [  234.670128] RSP: 002b:00007ffd3b01ddd8 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000001
> [  234.674811] RAX: ffffffffffffffda RBX: 000000000000000d RCX:
> 00007fcaf9c9cfd0
> [  234.679387] RDX: 000000000000000d RSI: 00007fcafa5c1000 RDI:
> 0000000000000001
> [  234.683848] RBP: 00007fcafa5c1000 R08: 000000000000000a R09:
> 00007fcafa59b740
> [  234.688167] R10: 00007ffd3b01d8e0 R11: 0000000000000246 R12:
> 00007fcaf9f75400
> [  234.692386] R13: 000000000000000d R14: 0000000000000001 R15:
> 0000000000000000
> [  234.696495] irq event stamp: 153067
> [  234.700525] hardirqs last enabled at (153067): [<ffffffff83258c39>]
> _raw_spin_unlock_irqrestore+0x59/0x70
> [  234.704665] hardirqs last disabled at (153066):
> [<ffffffff83259382>] _raw_spin_lock_irqsave+0x22/0x90
> [  234.708722] softirqs last enabled at (153058): [<ffffffff836006c5>]
> __do_softirq+0x6c5/0xb4e
> [  234.712673] softirqs last disabled at (153051):
> [<ffffffff81227c1d>] irq_exit+0x17d/0x1d0
> [  234.716601] ---[ end trace 5dbf096843ee9ce6 ]---
> 
> Fixes: df097a278c75 ("IB/mlx5: Use the new mlx5 core notifier API")
> Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> ---
>  drivers/infiniband/hw/mlx5/main.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/mlx5/main.c
> b/drivers/infiniband/hw/mlx5/main.c
> index c2a5780cb394..e12a4404096b 100644
> --- a/drivers/infiniband/hw/mlx5/main.c
> +++ b/drivers/infiniband/hw/mlx5/main.c
> @@ -5802,13 +5802,12 @@ static void mlx5_ib_unbind_slave_port(struct
> mlx5_ib_dev *ibdev,
>  		return;
>  	}
>  
> -	if (mpi->mdev_events.notifier_call)
> -		mlx5_notifier_unregister(mpi->mdev, &mpi->mdev_events);
> -	mpi->mdev_events.notifier_call = NULL;
> -
>  	mpi->ibdev = NULL;
>  
>  	spin_unlock(&port->mp.mpi_lock);
> +	if (mpi->mdev_events.notifier_call)
> +		mlx5_notifier_unregister(mpi->mdev, &mpi->mdev_events);
> +	mpi->mdev_events.notifier_call = NULL;

I can see where this fixes the problem at hand, but this gives the
appearance of creating a new race.  Doing a check/unregister/set-null
series outside of any locks is a red flag to someone investigating the
code.  You should at least make note of the fact that calling unregister
more than once is safe.  If you're fine with it, I can add a comment and
take the patch, or you can resubmit.

>  	mlx5_remove_netdev_notifier(ibdev, port_num);
>  	spin_lock(&port->mp.mpi_lock);
>  

-- 
Doug Ledford <dledford@xxxxxxxxxx>
    GPG KeyID: B826A3330E572FDD
    Fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD

Attachment: signature.asc
Description: This is a digitally signed message part


[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux