On Mon, Jul 23, 2018 at 08:33:06PM +0300, Leon Romanovsky wrote: > On Sun, Jul 22, 2018 at 09:44:49PM -0600, Jason Gunthorpe wrote: > > On Tue, Jul 17, 2018 at 07:14:00AM +0300, Leon Romanovsky wrote: > > > On Tue, Jul 10, 2018 at 08:55:23PM -0600, Jason Gunthorpe wrote: > > > > From: Jason Gunthorpe <jgg@xxxxxxxxxxxx> > > > > > > > > We have a parallel unlocked reader and writer with ib_uverbs_get_context() > > > > vs everything else, and nothing guarentees this works properly. > > > > > > > > Audit and fix all of the places that access ucontext to use one of the > > > > following locking schemes: > > > > - Call ib_uverbs_get_ucontext() under SRCU and check for failure > > > > - Access the ucontext through an struct ib_uobject context member > > > > while holding a READ or WRITE lock on the uobject. > > > > This value cannot be NULL and has no race. > > > > - Hold the hw_destroy_rwsem and check for ufile->ucontext !NULL > > > > > > > > This also re-implements ib_uverbs_get_ucontext() in a way that is safe > > > > against concurrent ib_uverbs_get_context() and disassociation. > > > > > > > > As a side effect, every access to ucontext in the commands is via > > > > ib_uverbs_get_context() with an error check, or via the uobject, so there > > > > is no longer any need for the core code to check ucontext on every command > > > > call. These checks are also removed. > > > > > > > > Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxxxx> > > > > drivers/infiniband/core/rdma_core.c | 14 +++++-- > > > > drivers/infiniband/core/uverbs.h | 4 ++ > > > > drivers/infiniband/core/uverbs_cmd.c | 20 ++++++---- > > > > drivers/infiniband/core/uverbs_ioctl.c | 5 +-- > > > > drivers/infiniband/core/uverbs_main.c | 38 ++++++++++++------- > > > > drivers/infiniband/core/uverbs_std_types_cq.c | 2 +- > > > > drivers/infiniband/core/uverbs_std_types_dm.c | 2 +- > > > > drivers/infiniband/hw/mlx5/devx.c | 20 +++++++--- > > > > include/rdma/uverbs_ioctl.h | 8 ---- > > > > 9 files changed, 68 insertions(+), 45 deletions(-) > > > > > > > > > > This patch causes to races between cleanup and destroy. > > > > This doesn't reproduce for me with this test, but by inspection, this > > is racing IB_USER_VERBS_CMD_GET_CONTEXT with destroy (not cleanup vs > > destroy). > > > > The issue is changing the lock in ib_uverbs_get_context() triggers > > this. > > > > The next series fixes the WARN_ON by making the lock in the destroy > > path blocking and then just deleting the WARN_ON entirely. > > > > Otherwise, I'm not seeing an easy fix. This lock must be changed in > > get_context to eliminate the races around ucontext (the matching > > read-side is in alloc_uobj), and that rework is a necessary piece of > > removing the WARN_ON entirely. > > > > Since this only causes dmesg spam, can only be caused by deliberate > > mis-use of the verbs API, and is being fixed in the next series, I'm > > inclined to leave it as-is, or maybe make it a WARN_ON_ONCE or similar? > > > > What do you think? > > I would like to avoid exposing "race", especially that I have no > confidence that next series will close it or hide it. Well, I suggested deleting the WARN_ON here.. Sigh, actually there is a real problem here, if you race close on a comp channel with get_context the kernel will leak a uobject. Okay, I will come up with something else for this patch. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html