On Tue, Aug 28, 2018 at 02:52:41PM -0500, Steve Wise wrote: > 1) iw_cxgb4:flush_qp() must enforce that the c4iw_qp.wq is only flushed > once, upon entering ERROR state. Currently it doesn't do this for user > mode queues. This can cause a touch-after-free crash. > > 2) the "double add" list_add_tail() warning still remains with the fix > for #1. And I found out why. I forced a BUG() when the double add > condition happens. > > One CPU is here spinning trying to acquire the event queue spinlock: > > #5 [ffffaa0fe6aafd50] native_queued_spin_lock_slowpath at ffffffff8aeda512 > #6 [ffffaa0fe6aafd50] queued_spin_lock_slowpath at ffffffff8aeda1a0 > #7 [ffffaa0fe6aafd58] uverbs_hot_unplug_completion_event_file at > ffffffffc0a5743b [ib_uverbs] > #8 [ffffaa0fe6aafd78] remove_commit_fd_uobject at ffffffffc0a56597 > [ib_uverbs] > #9 [ffffaa0fe6aafd98] _rdma_remove_commit_uobject at ffffffffc0a567cf > [ib_uverbs] > #10 [ffffaa0fe6aafdc0] uverbs_close_fd at ffffffffc0a56e7f [ib_uverbs] > #11 [ffffaa0fe6aafdf0] ib_uverbs_comp_event_close at ffffffffc0a4daef > [ib_uverbs] > #12 [ffffaa0fe6aafe20] __fput at ffffffff8b043f05 > #13 [ffffaa0fe6aafe60] task_work_run at ffffffff8aea8bf2 > #14 [ffffaa0fe6aafe90] do_exit at ffffffff8ae8f7cd > #15 [ffffaa0fe6aaff00] do_group_exit at ffffffff8ae900b9 > #16 [ffffaa0fe6aaff30] do_syscall_64 at ffffffff8ae0381e > #17 [ffffaa0fe6aaff50] entry_SYSCALL_64_after_hwframe at ffffffff8b600081 > > In ib_uverbs_comp_event_close() however, all the ib_uverb_event entries > were removed from the event queue event list here: > > spin_lock_irq(&file->ev_queue.lock); > list_for_each_entry_safe(entry, tmp, &file->ev_queue.event_list, > list) { > if (entry->counter) > list_del(&entry->obj_list); > kfree(entry); > } > spin_unlock_irq(&file->ev_queue.lock); > > Notice that each entry is removed from the uobj_list but it is NOT > removed from the event list, but rather it is just kfree()d. > > So immediately after this happens (when the spin lock is released), > another CPU tries to insert a new event because the still allocated qp > is moving to ERROR state as part of a connection abort by the peer: This is not permitted. When ib_uverbs_comp_event_close() is called there can be no other concurrent user of the uobject. If there is, it signals that someone is chasing a stale pointer or is not doing uobject put/get reference counting properly. > But looking at uverbs_hot_unplug_completion_event_file(), I see it > acquires the event_queue spinlock just to set ev_queue->is_closed = 1. This is marking the comp channel as 'disassociated' but not closed, so it can't be moved.. > Perhaps setting is_closed to 1 should have been done in > ib_uverbs_comp_event_close()? No, nothing can use the comp channel by the time we get to close, so any reliance on is_closed is wrong refcounting. Sounds like it is still what I said before, the cxgb4 driver is continuing to refer to a ib_qp's internal data after destroy has happened, which is strictly not allowed. Jason