Re: possible uverbs bug with comp events?

Jason Gunthorpe <jgg@xxxxxxxx> · Sat, 1 Sep 2018 14:43:18 -0600

On Tue, Aug 28, 2018 at 02:52:41PM -0500, Steve Wise wrote:

> 1) iw_cxgb4:flush_qp() must enforce that the c4iw_qp.wq is only flushed
> once, upon entering ERROR state.  Currently it doesn't do this for user
> mode queues.  This can cause a touch-after-free crash.
> 
> 2) the "double add" list_add_tail() warning still remains with the fix
> for #1.  And I found out why.  I forced a BUG() when the double add
> condition happens.
> 
> One CPU is here spinning trying to acquire the event queue spinlock:
> 
>  #5 [ffffaa0fe6aafd50] native_queued_spin_lock_slowpath at ffffffff8aeda512
>  #6 [ffffaa0fe6aafd50] queued_spin_lock_slowpath at ffffffff8aeda1a0
>  #7 [ffffaa0fe6aafd58] uverbs_hot_unplug_completion_event_file at
> ffffffffc0a5743b [ib_uverbs]
>  #8 [ffffaa0fe6aafd78] remove_commit_fd_uobject at ffffffffc0a56597
> [ib_uverbs]
>  #9 [ffffaa0fe6aafd98] _rdma_remove_commit_uobject at ffffffffc0a567cf
> [ib_uverbs]
> #10 [ffffaa0fe6aafdc0] uverbs_close_fd at ffffffffc0a56e7f [ib_uverbs]
> #11 [ffffaa0fe6aafdf0] ib_uverbs_comp_event_close at ffffffffc0a4daef
> [ib_uverbs]
> #12 [ffffaa0fe6aafe20] __fput at ffffffff8b043f05
> #13 [ffffaa0fe6aafe60] task_work_run at ffffffff8aea8bf2
> #14 [ffffaa0fe6aafe90] do_exit at ffffffff8ae8f7cd
> #15 [ffffaa0fe6aaff00] do_group_exit at ffffffff8ae900b9
> #16 [ffffaa0fe6aaff30] do_syscall_64 at ffffffff8ae0381e
> #17 [ffffaa0fe6aaff50] entry_SYSCALL_64_after_hwframe at ffffffff8b600081
> 
> In ib_uverbs_comp_event_close() however, all the ib_uverb_event entries
> were removed from the event queue event list here:
> 
>         spin_lock_irq(&file->ev_queue.lock);
>         list_for_each_entry_safe(entry, tmp, &file->ev_queue.event_list,
> list) {
>                 if (entry->counter)
>                         list_del(&entry->obj_list);
>                 kfree(entry);
>         }
>         spin_unlock_irq(&file->ev_queue.lock);
> 
> Notice that each entry is removed from the uobj_list but it is NOT
> removed from the event list, but rather it is just kfree()d.
> 
> So immediately after this happens (when the spin lock is released),
> another CPU tries to insert a new event because the still allocated qp
> is moving to ERROR state as part of a connection abort by the peer:

This is not permitted. 

When ib_uverbs_comp_event_close() is called there can be no other
concurrent user of the uobject.

If there is, it signals that someone is chasing a stale pointer or is
not doing uobject put/get reference counting properly.

> But looking at uverbs_hot_unplug_completion_event_file(), I see it
> acquires the event_queue spinlock just to set ev_queue->is_closed = 1.

This is marking the comp channel as 'disassociated' but not closed, so
it can't be moved..

> Perhaps setting is_closed to 1 should have been done in
> ib_uverbs_comp_event_close()?

No, nothing can use the comp channel by the time we get to close, so
any reliance on is_closed is wrong refcounting.

Sounds like it is still what I said before, the cxgb4 driver is
continuing to refer to a ib_qp's internal data after destroy has
happened, which is strictly not allowed.

Jason