Re: [PATCH V2] IB/uverbs: Fix race between uverbs_close and remove_one

On Tue, Mar 8, 2016 at 12:38 AM, Jason Gunthorpe
<jgunthorpe@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, Mar 07, 2016 at 04:44:33AM -0500, Devesh Sharma wrote:
>
>> [67140.260665]  [<ffffffff810c16a0>] ? prepare_to_wait_event+0xf0/0xf0
>> [67140.268337]  [<ffffffffa04cabc3>] ? ib_dereg_mr+0x23/0x30 [ib_core]
>
> So, ib_dereg_mr at this point:
>
>         ret = mr->device->dereg_mr(mr);
>
> Is running when mr->device is already freed?

Yes.

>
>> During rmmod <vendor-driver> "ib_uverbs_close()" context is
>> still running, while "ib_uverbs_remove_one()" context completes and
>> ends up freeing ib_dev pointer, thus causing a Kernel Panic.
>
> Hurm..
>
> So ib_uverbs_close is busy running in ib_uverbs_cleanup_ucontext and
> then ib_uverbs_free_hw_resources is called?

Yes, and it also completed, which unblocked ib_unregister_device(), which
actually frees the device pointer.

>
> At first blush it certainly looks like the locking around
> ib_uverbs_cleanup_context is wrong.

I agree; it is called from both locations without any synchronization.

>
>> This patch fixes the race. ib_uverbs_close validates dev->ib_dev against NULL
>> inside an srcu lock. If it is NULL, it waits for a completion and drops the srcu
>> else continues with the normal flow.
>
> Hum.. So this is really weird, this patch is basically duplicating a
> mutex with srcu and a completion??

Agreed.

>
> What is wrong with simply this:
>
> --- a/drivers/infiniband/core/uverbs_main.c
> +++ b/drivers/infiniband/core/uverbs_main.c
> @@ -962,9 +962,9 @@ static int ib_uverbs_close(struct inode *inode, struct file *filp)
>                 list_del(&file->list);
>                 file->is_closed = 1;
>         }
> -       mutex_unlock(&file->device->lists_mutex);
>         if (ucontext)
>                 ib_uverbs_cleanup_ucontext(file, ucontext);
> +       mutex_unlock(&file->device->lists_mutex);
>
>
> ??

There is the following comment about lists_mutex in uverbs_main.c, around
line 1200:
/* We must release the mutex before going ahead and calling
 * disassociate_ucontext. disassociate_ucontext might end up
 * indirectly calling uverbs_close, for example due to freeing
 * the resources (e.g mmput).
 */
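
To illustrate why that comment matters, here is a minimal userspace sketch,
not kernel code. It assumes a pthread mutex as a stand-in for lists_mutex, and
every fake_* name is hypothetical. It uses trylock to detect the case where a
nested close would block on a mutex that the caller already holds:

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the uverbs device; not kernel code. */
struct fake_dev {
    pthread_mutex_t lists_mutex;
    bool close_would_block;   /* set if the nested close found the mutex held */
};

/* Stand-in for ib_uverbs_close(): it needs lists_mutex to unlink the file. */
static void fake_close(struct fake_dev *dev)
{
    if (pthread_mutex_trylock(&dev->lists_mutex) != 0) {
        /* A blocking lock here would wait forever on our own lock. */
        dev->close_would_block = true;
        return;
    }
    /* ... the list_del() equivalent would go here ... */
    pthread_mutex_unlock(&dev->lists_mutex);
}

/* Stand-in for disassociate_ucontext(), which may end up closing the file. */
static void fake_disassociate(struct fake_dev *dev)
{
    fake_close(dev);
}

/* Returns true if calling disassociate under the mutex would self-deadlock. */
static bool disassociate_deadlocks(bool hold_mutex)
{
    struct fake_dev dev = { .close_would_block = false };

    pthread_mutex_init(&dev.lists_mutex, NULL);
    if (hold_mutex)
        pthread_mutex_lock(&dev.lists_mutex);
    fake_disassociate(&dev);
    if (hold_mutex)
        pthread_mutex_unlock(&dev.lists_mutex);
    pthread_mutex_destroy(&dev.lists_mutex);
    return dev.close_would_block;
}
```

In this sketch, disassociate_deadlocks(true) reports the problem and
disassociate_deadlocks(false) does not, which is the situation the proposed
change to ib_uverbs_close would have to avoid.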

>
> Noting that ib_uverbs_free_hw_resources holds lists_mutex while
> calling ib_uverbs_cleanup_ucontext, so it should be safe, or we have
> another bug?

No, ib_uverbs_cleanup_ucontext is called outside the mutex in this context.
The code takes a reference on the file pointer from the list, releases the
mutex, and then calls ib_uverbs_cleanup_ucontext. After the call is done, the
mutex is acquired again.
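
The sequence described above can be sketched roughly as follows. This is a
userspace approximation based only on the description in this mail, not the
actual uverbs_main.c code: the sketch_* names, the plain integer refcount, and
the singly linked list are all assumptions made for illustration.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the uverbs file and device; not kernel code. */
struct sketch_file {
    struct sketch_file *next;
    int refcount;
    int cleaned_unlocked;   /* 1 if cleanup ran while lists_mutex was free */
};

struct sketch_dev {
    pthread_mutex_t lists_mutex;
    struct sketch_file *files;
};

/* Stand-in for ib_uverbs_cleanup_ucontext(); records whether the mutex was free. */
static void sketch_cleanup_ucontext(struct sketch_dev *dev, struct sketch_file *f)
{
    if (pthread_mutex_trylock(&dev->lists_mutex) == 0) {
        f->cleaned_unlocked = 1;
        pthread_mutex_unlock(&dev->lists_mutex);
    }
}

/* Stand-in for ib_uverbs_free_hw_resources(); returns the number of files handled. */
static int sketch_free_hw_resources(struct sketch_dev *dev)
{
    int n = 0;

    pthread_mutex_lock(&dev->lists_mutex);
    while (dev->files != NULL) {
        struct sketch_file *f = dev->files;

        dev->files = f->next;   /* unlink from the list under the mutex */
        f->refcount++;          /* pin the file so it cannot go away */
        pthread_mutex_unlock(&dev->lists_mutex);

        sketch_cleanup_ucontext(dev, f);   /* runs with the mutex dropped */
        f->refcount--;                     /* drop the pin */
        n++;

        pthread_mutex_lock(&dev->lists_mutex);   /* reacquire for the next entry */
    }
    pthread_mutex_unlock(&dev->lists_mutex);
    return n;
}

/* Small driver: two files on the list, both cleaned up outside the mutex. */
static int sketch_demo(void)
{
    struct sketch_file b = { NULL, 1, 0 };
    struct sketch_file a = { &b, 1, 0 };
    struct sketch_dev dev;
    int n;

    pthread_mutex_init(&dev.lists_mutex, NULL);
    dev.files = &a;
    n = sketch_free_hw_resources(&dev);
    pthread_mutex_destroy(&dev.lists_mutex);

    return n == 2 && a.cleaned_unlocked && b.cleaned_unlocked &&
           a.refcount == 1 && b.refcount == 1 && dev.files == NULL;
}
```

Because the cleanup step runs with the mutex dropped, nothing in this pattern
stops a concurrent close path from running its own cleanup on another file at
the same time, which is the missing synchronization discussed above.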

>
> Certainly, the above is closer to the original intent of how this was
> supposed to work...
>
> Jason