From: Yishai Hadas <yishaih@xxxxxxxxxxxx>

The async_file might be freed before the disassociation has ended.

uverbs_destroy_ufile_hw() is not a fence: it returns if a disassociation is
ongoing in another thread. It has to be written this way to avoid deadlock.
However, this means that the ufile FD close cannot destroy anything that may
still be used by an active kref, such as the async_file.

To fix that, move the kref_put() to ib_uverbs_release_file().

[69306.073743] BUG: unable to handle kernel paging request at ffffffffba682787
[69306.077265] PGD bc80e067 P4D bc80e067 PUD bc80f063 PMD 1313df163 PTE 80000000bc682061
[69306.079781] Oops: 0003 [#1] SMP PTI
[69306.081140] CPU: 1 PID: 32410 Comm: bash Tainted: G OE 4.20.0-rc6+ #3
[69306.083555] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[69306.085663] RIP: 0010:__pv_queued_spin_lock_slowpath+0x1b3/0x2a0
[69306.087833] Code: 98 83 e2 60 49 89 df 48 8b 04 c5 80 18 72 ba 48 8d ba 80 32 02 00 ba 00 80 00 00 4c 8d 65 14 41 bd 01 00 00 00 48 01 c7 85 d2 <48> 89 2f 48 89 fb 74 14 8b 45 08 85 c0 75 42 84 d2 74 6b f3 90 83
[69306.093895] RSP: 0018:ffffc1bbc064fb58 EFLAGS: 00010006
[69306.095951] RAX: ffffffffba65f4e7 RBX: ffff9f209c656c00 RCX: 0000000000000001
[69306.098520] RDX: 0000000000008000 RSI: 0000000000000000 RDI: ffffffffba682787
[69306.101064] RBP: ffff9f217bb23280 R08: 0000000000000001 R09: 0000000000000000
[69306.103644] R10: ffff9f209d2c7800 R11: ffffffffffffffe8 R12: ffff9f217bb23294
[69306.106167] R13: 0000000000000001 R14: 0000000000000000 R15: ffff9f209c656c00
[69306.108689] FS: 00007fac55aad740(0000) GS:ffff9f217bb00000(0000) knlGS:0000000000000000
[69306.111503] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[69306.113651] CR2: ffffffffba682787 CR3: 000000012f8e0000 CR4: 00000000000006e0
[69306.116195] Call Trace:
[69306.117546] _raw_spin_lock_irq+0x27/0x30
[69306.119299] ib_uverbs_release_uevent+0x1e/0xa0 [ib_uverbs]
[69306.121439] uverbs_free_qp+0x7e/0x90 [ib_uverbs]
[69306.123354] destroy_hw_idr_uobject+0x1c/0x50 [ib_uverbs]
[69306.125454] uverbs_destroy_uobject+0x2e/0x180 [ib_uverbs]
[69306.127564] __uverbs_cleanup_ufile+0x73/0x90 [ib_uverbs]
[69306.129639] uverbs_destroy_ufile_hw+0x5d/0x120 [ib_uverbs]
[69306.131770] ib_uverbs_remove_one+0xea/0x240 [ib_uverbs]
[69306.133878] ib_unregister_device+0xfb/0x200 [ib_core]
[69306.135906] mlx5_ib_remove+0x51/0xe0 [mlx5_ib]
[69306.137820] mlx5_remove_device+0xc1/0xd0 [mlx5_core]
[69306.139797] mlx5_unregister_device+0x3d/0xb0 [mlx5_core]
[69306.141829] remove_one+0x2a/0x90 [mlx5_core]
[69306.143611] pci_device_remove+0x3b/0xc0
[69306.145248] device_release_driver_internal+0x16d/0x240
[69306.147251] unbind_store+0xb2/0x100
[69306.148772] kernfs_fop_write+0x102/0x180
[69306.150446] __vfs_write+0x36/0x1a0
[69306.151929] ? __alloc_fd+0xa9/0x170
[69306.153426] ? set_close_on_exec+0x49/0x70
[69306.155027] vfs_write+0xad/0x1a0
[69306.156463] ksys_write+0x52/0xc0
[69306.157873] do_syscall_64+0x5b/0x180
[69306.159393] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[69306.161262] RIP: 0033:0x7fac551aac60
[69306.162750] Code: 73 01 c3 48 8b 0d 30 62 2d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 3d c3 2d 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ee cb 01 00 48 89 04 24
[69306.168858] RSP: 002b:00007ffd167b3568 EFLAGS: 00000246 ORIG_RAX:0000000000000001
[69306.171448] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007fac551aac60
[69306.173950] RDX: 000000000000000d RSI: 00007fac55ad5000 RDI: 0000000000000001
[69306.176448] RBP: 00007fac55ad5000 R08: 000000000000000a R09: 00007fac55aad740
[69306.178919] R10: 00007fac55aad740 R11: 0000000000000246 R12: 00007fac55482400
[69306.181423] R13: 000000000000000d R14: 0000000000000001 R15: 0000000000000000
[69306.183904] Modules linked in: netconsole nfsv3 nfs_acl rdma_ucm rdma_cm iw_cm ib_ipoib ib_cm ib_umad mlx5_ib(OE) mlx5_core(OE) mlxfw mlx4_en mlx4_ib ib_uverbs(OE) ib_core mlx4_core devlink
rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache ipmi_devintf ipmi_msghandler sunrpc dm_mirror dm_region_hash dm_log dm_mod crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper joydev pcspkr virtio_balloon sg i2c_piix4 ip_tables ext4 mbcache jbd2 sd_mod ata_generic pata_acpi cirrus drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm virtio_net net_failover ata_piix virtio_console failover libata floppy virtio_pci serio_raw crc32c_intel i2c_core virtio_ring virtio [last unloaded: mlxfw]

Cc: <stable@xxxxxxxxxxxxxxx> # 4.2
Fixes: 036b10635739 ("IB/uverbs: Enable device removal when there are active user space applications")
Signed-off-by: Yishai Hadas <yishaih@xxxxxxxxxxxx>
Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
---
 drivers/infiniband/core/uverbs_main.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index 2890a77339e1..15add0688fbb 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -204,6 +204,9 @@ void ib_uverbs_release_file(struct kref *ref)
 	if (atomic_dec_and_test(&file->device->refcount))
 		ib_uverbs_comp_dev(file->device);
 
+	if (file->async_file)
+		kref_put(&file->async_file->ref,
+			 ib_uverbs_release_async_event_file);
 	put_device(&file->device->dev);
 	kfree(file);
 }
@@ -1096,10 +1099,6 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp)
 	list_del_init(&file->list);
 	mutex_unlock(&file->device->lists_mutex);
 
-	if (file->async_file)
-		kref_put(&file->async_file->ref,
-			 ib_uverbs_release_async_event_file);
-
 	kref_put(&file->ref, ib_uverbs_release_file);
 
 	return 0;
-- 
2.19.1
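
For illustration only, the ordering this patch establishes can be modelled in userspace. The sketch below is not kernel code: struct ref, ref_get(), ref_put(), fd_close() and async_survives_close() are hypothetical stand-ins (the real kref is atomic, this model is single-threaded). It assumes a device-removal path that still holds a ufile reference when the FD is closed, and shows that the async_file then stays valid until the last ufile reference is dropped.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, non-atomic stand-in for the kernel's struct kref. */
struct ref { int count; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void ref_get(struct ref *r) { r->count++; }

/* Drop one reference; run release() when it was the last one. */
static void ref_put(struct ref *r, void (*release)(struct ref *))
{
	if (--r->count == 0)
		release(r);
}

struct async_file { struct ref ref; };
struct ufile { struct ref ref; struct async_file *async_file; };

static int async_freed, ufile_freed;

static void release_async(struct ref *r)
{
	free(container_of(r, struct async_file, ref));
	async_freed = 1;
}

/* Models the patched release: the async_file reference is dropped
 * only when the last ufile reference goes away. */
static void release_ufile(struct ref *r)
{
	struct ufile *f = container_of(r, struct ufile, ref);

	if (f->async_file)
		ref_put(&f->async_file->ref, release_async);
	free(f);
	ufile_freed = 1;
}

/* Models the patched FD close: it only drops the FD's ufile reference. */
static void fd_close(struct ufile *f)
{
	ref_put(&f->ref, release_ufile);
}

/* Returns 1 if the async_file outlives the FD close while another
 * holder still pins the ufile, and both are freed afterwards. */
int async_survives_close(void)
{
	struct async_file *a = calloc(1, sizeof(*a));
	struct ufile *f = calloc(1, sizeof(*f));
	int alive;

	if (!a || !f) {
		free(a);
		free(f);
		return -1;
	}
	a->ref.count = 1;	/* reference owned by the ufile */
	f->ref.count = 1;	/* reference owned by the open FD */
	f->async_file = a;
	async_freed = ufile_freed = 0;

	ref_get(&f->ref);	/* removal path pins the ufile */
	fd_close(f);		/* userspace closes the FD meanwhile */
	alive = !async_freed && !ufile_freed;
	ref_put(&f->ref, release_ufile);	/* removal path finishes */

	return alive && async_freed && ufile_freed;
}
```

With the pre-patch ordering, fd_close() would drop the async_file reference itself, so in this model the async_file would already be freed while the removal path could still reach it through the ufile.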