On Tue, May 07, 2019 at 03:38:21PM +0300, Gal Pressman wrote: > On 06-May-19 21:31, Jason Gunthorpe wrote: > > On Mon, May 06, 2019 at 04:51:00PM +0300, Gal Pressman wrote: > >>>>>>>> +static void efa_com_admin_flush(struct efa_com_dev *edev) > >>>>>>>> +{ > >>>>>>>> + struct efa_com_admin_queue *aq = &edev->aq; > >>>>>>>> + > >>>>>>>> + clear_bit(EFA_AQ_STATE_RUNNING_BIT, &aq->state); > >>>>>>> > >>>>>>> This scheme looks use after free racey to me.. > >>>>>> > >>>>>> The running bit stops new admin commands from being submitted, clearly the exact > >>>>>> moment in which the bit is cleared is "racy" to submission of admin commands but > >>>>>> that is taken care of. > >>>>>> > >>>>>> The submission of an admin command is atomic as it is guarded by an admin queue > >>>>>> lock. > >>>>>> The same lock is acquired by this flow as well when flushing the admin queue. > >>>>>> After all admin commands have been aborted and we know for sure that > >>>>>> no new > >>>>> > >>>>> The problem is the 'abort' does nothing to ensure parallel threads are > >>>>> no longer touching this memory, > >>>> > >>>> Which memory? The user threads touch the user allocated buffers which are not > >>>> being freed on admin queue destroy. > >>> > >>> The memory the other thread is touching is freed a few lines below in > >>> a devm_kfree. The apparent purpose of this code is to make the other > >>> thread stop but does it wrong. > >> > >> Are we talking about the completion context/completion context pool? > >> The user thread does use this memory, but this is done while the avail_cmds > >> semaphore is down which means the wait_for_abort_completion function is still > >> waiting for this thread to finish. > > > > We are talking about > > > > CPU 0 CPU 1 > > efa_com_submit_admin_cmd() > > spin_lock(&aq->sq.lock); > > > > efa_remove_device() > > efa_com_admin_destroy() > > efa_com_admin_flush() > > [..] > > kfree(aq) > > > > > > As long as efa_com_submit_admin_cmd() is running the semaphore is still "down" > which means the wait function will be blocked. It is a race, order it a little differently: CPU 0 CPU 1 efa_com_submit_admin_cmd() efa_remove_device() efa_com_admin_destroy() efa_com_admin_flush() efa_com_wait_for_abort_completion() [..] spin_lock(&aq->sq.lock); kfree(aq) Fundamentally you can't use locking *inside* the memory you are trying to free to exclude other threads from using that memory. That is always a user after free. Which is why when I see someone write something like: spin_lock(&aq->sq.lock); if (!test_bit(EFA_AQ_STATE_RUNNING_BIT, &aq->state)) { ibdev_err(aq->efa_dev, "Admin queue is closed\n"); spin_unlock(&aq->sq.lock); it is almost always a bug And when you see matching things like: [..] set_bit(EFA_AQ_STATE_POLLING_BIT, &edev->aq.state); kfree(edev) You know it is screwed up in some way. > > So, either there is no possible concurrency with the 'aq' users and > > device removal, in which case all the convoluted locking in > > efa_com_admin_flush() and related is unneeded > > > > Or there is concurrency and it isn't being torn down properly, so we > > get the above race. > > > > My read is that all the 'admin commands' are done off of verbs > > callbacks and ib_unregister_device is called before we get to > > efa_remove_device (guaranteeing there are no running verbs callbacks), > > so there is no possible concurrency and all this efa_com_admin_flush() > > and related is pointless obfuscation. Delete it. > > You're right, the "abort" flow was overcautious as there shouldn't be any > pending threads after ib_unregister_device. > I will remove this flow. Send a follow up patch Jason