Re: [PATCH rdma-rc] IB/uverbs: Fix OOPs upon device disassociation

On Thu, Jan 24, 2019 at 02:33:12PM +0200, Leon Romanovsky wrote:
> From: Yishai Hadas <yishaih@xxxxxxxxxxxx>
> 
> The async_file might be freed before the disassociation has
> ended.
> 
> uverbs_destroy_ufile_hw is not a fence, it returns if a disassociation
> is ongoing in another thread. It has to be written this way to avoid
> deadlock. However, this means that the ufile FD close cannot destroy
> anything that may still be used by an active kref, such as the
> async_file.
> 
> To fix that, move the async_file kref_put() into ib_uverbs_release_file().
> 
> [69306.073743] BUG: unable to handle kernel paging request at ffffffffba682787
> [69306.077265] PGD bc80e067 P4D bc80e067 PUD bc80f063 PMD 1313df163 PTE 80000000bc682061
> [69306.079781] Oops: 0003 [#1] SMP PTI
> [69306.081140] CPU: 1 PID: 32410 Comm: bash Tainted: G           OE 4.20.0-rc6+ #3
> [69306.083555] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> [69306.085663] RIP: 0010:__pv_queued_spin_lock_slowpath+0x1b3/0x2a0
> [69306.087833] Code: 98 83 e2 60 49 89 df 48 8b 04 c5 80 18 72 ba 48 8d
> 		ba 80 32 02 00 ba 00 80 00 00 4c 8d 65 14 41 bd 01 00 00 00 48 01 c7 85
> 		d2 <48> 89 2f 48 89 fb 74 14 8b 45 08 85 c0 75 42 84 d2 74 6b f3 90 83
> [69306.093895] RSP: 0018:ffffc1bbc064fb58 EFLAGS: 00010006
> [69306.095951] RAX: ffffffffba65f4e7 RBX: ffff9f209c656c00 RCX: 0000000000000001
> [69306.098520] RDX: 0000000000008000 RSI: 0000000000000000 RDI: ffffffffba682787
> [69306.101064] RBP: ffff9f217bb23280 R08: 0000000000000001 R09: 0000000000000000
> [69306.103644] R10: ffff9f209d2c7800 R11: ffffffffffffffe8 R12: ffff9f217bb23294
> [69306.106167] R13: 0000000000000001 R14: 0000000000000000 R15: ffff9f209c656c00
> [69306.108689] FS:  00007fac55aad740(0000) GS:ffff9f217bb00000(0000) knlGS:0000000000000000
> [69306.111503] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [69306.113651] CR2: ffffffffba682787 CR3: 000000012f8e0000 CR4: 00000000000006e0
> [69306.116195] Call Trace:
> [69306.117546]  _raw_spin_lock_irq+0x27/0x30
> [69306.119299]  ib_uverbs_release_uevent+0x1e/0xa0 [ib_uverbs]
> [69306.121439]  uverbs_free_qp+0x7e/0x90 [ib_uverbs]
> [69306.123354]  destroy_hw_idr_uobject+0x1c/0x50 [ib_uverbs]
> [69306.125454]  uverbs_destroy_uobject+0x2e/0x180 [ib_uverbs]
> [69306.127564]  __uverbs_cleanup_ufile+0x73/0x90 [ib_uverbs]
> [69306.129639]  uverbs_destroy_ufile_hw+0x5d/0x120 [ib_uverbs]
> [69306.131770]  ib_uverbs_remove_one+0xea/0x240 [ib_uverbs]
> [69306.133878]  ib_unregister_device+0xfb/0x200 [ib_core]
> [69306.135906]  mlx5_ib_remove+0x51/0xe0 [mlx5_ib]
> [69306.137820]  mlx5_remove_device+0xc1/0xd0 [mlx5_core]
> [69306.139797]  mlx5_unregister_device+0x3d/0xb0 [mlx5_core]
> [69306.141829]  remove_one+0x2a/0x90 [mlx5_core]
> [69306.143611]  pci_device_remove+0x3b/0xc0
> [69306.145248]  device_release_driver_internal+0x16d/0x240
> [69306.147251]  unbind_store+0xb2/0x100
> [69306.148772]  kernfs_fop_write+0x102/0x180
> [69306.150446]  __vfs_write+0x36/0x1a0
> [69306.151929]  ? __alloc_fd+0xa9/0x170
> [69306.153426]  ? set_close_on_exec+0x49/0x70
> [69306.155027]  vfs_write+0xad/0x1a0
> [69306.156463]  ksys_write+0x52/0xc0
> [69306.157873]  do_syscall_64+0x5b/0x180
> [69306.159393]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [69306.161262] RIP: 0033:0x7fac551aac60
> [69306.162750] Code: 73 01 c3 48 8b 0d 30 62 2d 00 f7 d8 64 89 01 48 83
> 		c8 ff c3 66 0f 1f 44 00 00 83 3d 3d c3 2d 00 00 75 10 b8 01 00 00 00 0f
> 		05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ee cb 01 00 48 89 04 24
> [69306.168858] RSP: 002b:00007ffd167b3568 EFLAGS: 00000246 ORIG_RAX:0000000000000001
> [69306.171448] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007fac551aac60
> [69306.173950] RDX: 000000000000000d RSI: 00007fac55ad5000 RDI: 0000000000000001
> [69306.176448] RBP: 00007fac55ad5000 R08: 000000000000000a R09: 00007fac55aad740
> [69306.178919] R10: 00007fac55aad740 R11: 0000000000000246 R12: 00007fac55482400
> [69306.181423] R13: 000000000000000d R14: 0000000000000001 R15: 0000000000000000
> [69306.183904] Modules linked in: netconsole nfsv3 nfs_acl rdma_ucm
> rdma_cm iw_cm ib_ipoib ib_cm ib_umad mlx5_ib(OE) mlx5_core(OE) mlxfw
> mlx4_en mlx4_ib ib_uverbs(OE) ib_core mlx4_core devlink rpcsec_gss_krb5
> auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache ipmi_devintf
> ipmi_msghandler sunrpc dm_mirror dm_region_hash dm_log dm_mod
> crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
> crypto_simd cryptd glue_helper joydev pcspkr virtio_balloon sg i2c_piix4
> ip_tables ext4 mbcache jbd2 sd_mod ata_generic pata_acpi cirrus
> drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm
> virtio_net net_failover ata_piix virtio_console failover libata floppy
> virtio_pci serio_raw crc32c_intel i2c_core virtio_ring virtio [last unloaded: mlxfw]
> 
> Cc: <stable@xxxxxxxxxxxxxxx> # 4.2
> Fixes: 036b10635739 ("IB/uverbs: Enable device removal when there are active user space applications")
> Signed-off-by: Yishai Hadas <yishaih@xxxxxxxxxxxx>
> Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> ---
>  drivers/infiniband/core/uverbs_main.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
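
The lifetime rule in the commit message can be illustrated with a minimal
user-space sketch in C. It assumes that ib_uverbs_release_file() is the
release callback that runs when the last ufile reference is dropped. Every
struct and function name below (ufile_model, async_file_model, fd_close, and
so on) is a hypothetical stand-in, not the kernel's code, and plain integers
model the krefs without any real concurrency.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel objects; not the real structs. */
struct async_file_model {
	int refs;
	bool freed;
};

struct ufile_model {
	int refs;
	struct async_file_model *async_file;
	bool freed;
};

static void async_put(struct async_file_model *af)
{
	assert(af->refs > 0);
	if (--af->refs == 0)
		af->freed = true;	/* models the free; storage stays inspectable */
}

/* Release callback: runs only when the last ufile reference is dropped. */
static void ufile_release(struct ufile_model *uf)
{
	if (uf->async_file)
		async_put(uf->async_file);
	uf->freed = true;
}

static void ufile_put(struct ufile_model *uf)
{
	assert(uf->refs > 0);
	if (--uf->refs == 0)
		ufile_release(uf);
}

/* FD close path: drops only the FD's own ufile reference. */
static void fd_close(struct ufile_model *uf)
{
	ufile_put(uf);
}

/*
 * Returns true if the async_file is still alive at the point where the
 * disassociation path would use it, after the FD has already been closed.
 */
static bool async_alive_during_disassociation(void)
{
	struct async_file_model af = { .refs = 1, .freed = false };
	struct ufile_model uf = { .refs = 1, .async_file = &af, .freed = false };
	bool alive;

	uf.refs++;		/* disassociation path takes its ufile reference */
	fd_close(&uf);		/* user closes the FD in the meantime */
	alive = !af.freed;	/* disassociation would touch async_file here */
	ufile_put(&uf);		/* disassociation finishes, last reference drops */

	assert(uf.freed && af.freed);
	return alive;
}
```

In this model, async_alive_during_disassociation() returns true: because the
async_file reference is dropped only from the ufile release callback, the FD
close alone cannot free it while the disassociation path still holds a ufile
reference.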

Applied to for-next

Thanks,
Jason


