On 8/22/2018 2:47 PM, Jason Gunthorpe wrote: > On Wed, Aug 22, 2018 at 02:03:47PM -0500, Steve Wise wrote: >> >> >> On 8/22/2018 1:49 PM, Jason Gunthorpe wrote: >>> On Tue, Aug 21, 2018 at 04:01:09PM -0500, Steve Wise wrote: >>>> Hey RDMAers, >>>> >>>> I see that a uverbs completion event is added to the ev_queue event_list >>>> as well as the associated uobject comp_list in ib_uverbs_comp_handler(): >>>> >>>> list_add_tail(&entry->list, &ev_queue->event_list); >>>> list_add_tail(&entry->obj_list, &uobj->comp_list); >>>> spin_unlock_irqrestore(&ev_queue->lock, flags); >>>> >>>> >>>> But in ib_uverbs_comp_event_close(), it looks like the entry could be >>>> left on the uobj comp_list and then the event structure is freed! >>> >>> I don't think we can get entries on the ev_queue for the >>> uverbs_event_fops FD that do not have counter set. >>> >>> It should be changed, it is confusing, but the change should be to >>> delete this: >>> >>> @@ -436,8 +436,7 @@ static int ib_uverbs_comp_event_close(struct inode *inode, struct file *filp) >>> >>> spin_lock_irq(&file->ev_queue.lock); >>> list_for_each_entry_safe(entry, tmp, &file->ev_queue.event_list, list) { >>> - if (entry->counter) >>> - list_del(&entry->obj_list); >>> + list_del(&entry->obj_list); >>> kfree(entry); >>> } >>> >>> The only events that should be pushed there are completion events, via >>> ib_uverbs_comp_handler and they always have non-NULL counter. >>> >>> Ie this in ib_uverbs_comp_handler: >>> >>> entry->counter = &uobj->comp_events_reported; >>> >>> Is never NULL, it is the address of a structure member. >>> >>> Jason >>> >> >> Thanks Jason, >> >> This then leaves me with the mystery of why I see "duplicate add" >> warnings from list_add_tail() called from ib_uverbs_comp_handler(). I >> wonder what other condition/state can cause this? > > Since ib_uverbs_comp_handler() uses fresh kmalloc memory there are > these cases I can see: > > 1 - The ib_uverbs_event was freed, but left on a list, then recylced by > kmalloc > 2 - The memory backing the ev_queue->event_list was freed before/during > ib_uverbs_comp_handler > 3 - ev_queue->event_list is manipulated without locking > 4 - The memory backing uobj->comp_list was freed before/during > ib_uverbs_comp_handler > 5 - uobj->comp_list is manipulated without locking > > #4 is prevented by ib_uverbs_cq_event_handler() not being called > after ib_destroy_cq(). This is very important, there is a bad bug if > this is not guaranteed > > #2 is prevented by #4, and holidng the uobject reference for the comp > channel like this. ie we cannot enter ib_uverbs_cq_event_handler() > without a guarentee that context_args is a held reference. > > static struct ib_ucq_object *create_cq(struct ib_uverbs_file *file, > [..] > ev_file = ib_uverbs_lookup_comp_file(cmd->comp_channel, file); > cq->event_handler = ib_uverbs_cq_event_handler; > cq->cq_context = ev_file ? &ev_file->xv_queue : NULL; > > Which has a matching put here: > > void ib_uverbs_release_ucq(struct ib_uverbs_file *file, > struct ib_uverbs_completion_event_file *ev_file, > struct ib_ucq_object *uobj) > [..] > uverbs_uobject_put(&ev_file->uobj); > > #5 is easy to check, the locking looks fine. > > That leaves 1 and 3 to check.. > > I see there are calls to comp_handler in the drivers. My best guess is > some driver has messed this up and is calling comp_handler after the > cq is destroyed? Maybe. The path that tickles this [1] is in iw_cxgb4. But the QP is being moved out of RTS, so the CQs should still be allocated. > > Are you running with slab poisoning or kasan on? I'll enable these. Thank you for the detailed analysis! Steve. [1] - iw_cxgb4 received a peer close request which causes a flush operation on the QP due to moving it from RTS->CLOSING. Any WRs pending in the QP are completed with "FLUSHED" status. And the uvberbs comp_handler is called which hits the "double add" warning. Aug 21 09:45:02 nazca3 kernel: list_add double add: new=ffff944ff02a2410, prev=ffff944ff02a2410, next=ffff9450495777c0. Aug 21 09:45:02 nazca3 kernel: ------------[ cut here ]------------ Aug 21 09:45:02 nazca3 kernel: Call Trace: Aug 21 09:45:02 nazca3 kernel: ib_uverbs_comp_handler+0x8e/0x150 [ib_uverbs] Aug 21 09:45:02 nazca3 kernel: flush_qp+0x5a/0xb8 [iw_cxgb4] Aug 21 09:45:02 nazca3 kernel: c4iw_modify_rc_qp+0x569/0xa1b [iw_cxgb4] Aug 21 09:45:02 nazca3 kernel: peer_close+0x1f8/0x2fa [iw_cxgb4] Aug 21 09:45:02 nazca3 kernel: ? ttwu_do_wakeup+0x19/0x140 Aug 21 09:45:02 nazca3 kernel: ? try_to_wake_up+0x54/0x450 Aug 21 09:45:02 nazca3 kernel: process_work+0x2d/0x51 [iw_cxgb4] Aug 21 09:45:02 nazca3 kernel: process_one_work+0x14f/0x360 Aug 21 09:45:02 nazca3 kernel: worker_thread+0x47/0x3e0 Aug 21 09:45:02 nazca3 kernel: kthread+0xfc/0x130 Aug 21 09:45:02 nazca3 kernel: ? max_active_store+0x80/0x80 Aug 21 09:45:02 nazca3 kernel: ? __kthread_parkme+0x90/0x90 Aug 21 09:45:02 nazca3 kernel: ret_from_fork+0x35/0x40