Re: [PATCH rdma-next v1] RDMA/core: Sync unregistration with netlink commands

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Nov 16, 2018 at 03:50:57AM +0200, Leon Romanovsky wrote:
> From: Parav Pandit <parav@xxxxxxxxxxxx>
> 
> Currently when rdma device is getting removed, get resource info can
> race with device removal example below.
> 
>       CPU-0                                  CPU-1
>     --------                               --------
>     rdma_nl_rcv_msg()
>        nldev_res_get_cq_dumpit()
>           mutex_lock(device_lock);
>           get device reference
>           mutex_unlock(device_lock);        [..]
>                                             ib_unregister_device()
>                                             /* Valid reference to
>                                              * device->dev exists.
>                                              */
>                                              ib_dealloc_device()
> 
>           [..]
>           provider->fill_res_entry();
> 
> Even though device object is not freed, fill_res_entry() can get called
> on device which doesn't from provider driver side.
> Kernel core device reference count is not sufficient.
> 
> Similar race can occur with device renaming and device removal, where
> device_rename() tries to rename a unregistered device. While this is fine
> for devices of a class which are not net namespace aware, but it is
> incorrect for net namespace aware class coming in subsequent series.
> If a class is net namespace aware, than below [1] call trace is observed
> in above situation.
> 
> Therefore, to avoid the such race, keep a reference count and let device
> unregistration wait until all netlink users drop the reference.
> 
> [1] Call trace:
> kernfs: ns required in 'infiniband' for 'mlx5_0'
> WARNING: CPU: 18 PID: 44270 at fs/kernfs/dir.c:842 kernfs_find_ns+0x104/0x120
> libahci i2c_core mlxfw libata dca [last unloaded: devlink]
> RIP: 0010:kernfs_find_ns+0x104/0x120
> Call Trace:
> kernfs_find_and_get_ns+0x2e/0x50
> sysfs_rename_link_ns+0x40/0xb0
> device_rename+0xb2/0xf0
> ib_device_rename+0xb3/0x100 [ib_core]
> nldev_set_doit+0x165/0x190 [ib_core]
> rdma_nl_rcv_msg+0x249/0x250 [ib_core]
> ? netlink_deliver_tap+0x8f/0x3e0
> rdma_nl_rcv+0xd6/0x120 [ib_core]
> netlink_unicast+0x17c/0x230
> netlink_sendmsg+0x2f0/0x3e0
> sock_sendmsg+0x30/0x40
> __sys_sendto+0xdc/0x160
> 
> Fixes: da5c85078215 ("RDMA/nldev: add driver-specific resource tracking")
> Signed-off-by: Parav Pandit <parav@xxxxxxxxxxxx>
> Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> ---
> Changelog v0->v1:
>  * Fixed type in comment
>  * Rephrased comment
> ---
>  drivers/infiniband/core/core_priv.h |  1 +
>  drivers/infiniband/core/device.c    | 27 +++++++++++++++++++++++----
>  drivers/infiniband/core/nldev.c     | 20 ++++++++++----------
>  include/rdma/ib_verbs.h             |  8 +++++++-
>  4 files changed, 41 insertions(+), 15 deletions(-)
> 
> --
> 2.19.1

This patch also doesn't apply, but the fix seemed simple enough. Again
please check. I also reworded some of the comments.

Applying: RDMA/core: Sync unregistration with netlink commands
Using index info to reconstruct a base tree...
M	drivers/infiniband/core/core_priv.h
M	drivers/infiniband/core/device.c
M	include/rdma/ib_verbs.h
Falling back to patching base and 3-way merge...
Auto-merging include/rdma/ib_verbs.h
Auto-merging drivers/infiniband/core/device.c
CONFLICT (content): Merge conflict in drivers/infiniband/core/device.c
Auto-merging drivers/infiniband/core/core_priv.h
error: Failed to merge in the changes.

Applied to for-next

Thanks,
Jason



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux