On 9/12/2018 4:33 PM, Jason Gunthorpe wrote: > On Tue, Sep 11, 2018 at 05:18:32PM -0500, Steve Wise wrote: >> >> >> On 9/11/2018 4:25 PM, Jason Gunthorpe wrote: >>> On Tue, Sep 11, 2018 at 04:20:50PM -0500, Steve Wise wrote: >>>> Ignore the old date... >>>> >>>> Jason, is this good to go? >>> >>> I'm not sure, I think your old patch might have been the right >>> one.. But I haven't checked it all through again yet. >>> >>> But I'm still not sure if the kref should be allowed to work like >>> this at all.. >>> >>> Leaving a long term dangling filep in the object pointer is not how >>> this was really intended to work. >>> >>> Jason >>> >> >> >> Looking at the original commit (1e7710f3f656), >> ib_uverbs_async_event_close() sets is_closed, yet >> ib_uverbs_comp_event_close() does not. Seems like an oversight in that >> commit maybe? But as for the whole design of this part of the rdma >> core, I'm not to familiar with how it "should work" to comment further. >> I just saw the evidence in the dump and tracked down the issue. :) > > Hmm... > > Okay, the other reason this is so confusing is because the is_closed > in uverbs_hot_unplug_completion_event_file() is not doing what it > needs to do .. > > With the below it starts to make more sense and has sensible > locking.. Once the HW disassociates is_closed becomes set and poll > indicates readable and read returns EIO, which is the standard > semantic for this sort of case. > > So, yes, this is the right version of this, we should be keeping > the is_closed in uverbs_hot_unplug_completion_event_file().. > > diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c > index 50152c1b100452..4a9d84038762fb 100644 > --- a/drivers/infiniband/core/uverbs_main.c > +++ b/drivers/infiniband/core/uverbs_main.c > @@ -281,25 +281,20 @@ static ssize_t ib_uverbs_event_read(struct ib_uverbs_event_queue *ev_queue, > spin_lock_irq(&ev_queue->lock); > > while (list_empty(&ev_queue->event_list)) { > + bool closed = ev_queue->is_closed; > + > spin_unlock_irq(&ev_queue->lock); > + if (closed) > + return -EIO; Shouldn't reading ev_queue->is_closed be done inside the lock? What if it is set immediately after reading it and before this code acquires the lock? > > if (filp->f_flags & O_NONBLOCK) > return -EAGAIN; > > if (wait_event_interruptible(ev_queue->poll_wait, > (!list_empty(&ev_queue->event_list) || > - /* The barriers built into wait_event_interruptible() > - * and wake_up() guarentee this will see the null set > - * without using RCU > - */ > - !uverbs_file->device->ib_dev))) > + ev_queue->is_closed))) > return -ERESTARTSYS; > > - /* If device was disassociated and no event exists set an error */ > - if (list_empty(&ev_queue->event_list) && > - !uverbs_file->device->ib_dev) > - return -EIO; > - > spin_lock_irq(&ev_queue->lock); > } > > @@ -361,7 +356,7 @@ static __poll_t ib_uverbs_event_poll(struct ib_uverbs_event_queue *ev_queue, > poll_wait(filp, &ev_queue->poll_wait, wait); > > spin_lock_irq(&ev_queue->lock); > - if (!list_empty(&ev_queue->event_list)) > + if (!list_empty(&ev_queue->event_list) || ev_queue->is_closed) > pollflags = EPOLLIN | EPOLLRDNORM; > spin_unlock_irq(&ev_queue->lock); > >