On Wed, Jun 23, 2021 at 1:17 PM Christian König
<ckoenig.leichtzumerken@xxxxxxxxx> wrote:
>
> On 22.06.21 at 19:02, Daniel Vetter wrote:
> > On Tue, Jun 22, 2021 at 3:07 PM Christian König
> > <ckoenig.leichtzumerken@xxxxxxxxx> wrote:
> >> Crap, hit enter too early before adding a cover letter.
> >>
> >> This is the same patch as before, but as requested I'm keeping the
> >> exclusive fence handling as it is for now.
> >>
> >> Daniel, can you double check this and/or make sure that it is tested?
> >>
> >> I only smoke tested it and the code is so complicated that I'm not sure
> >> I caught all side effects.
> > So I've thought about this some more, and we actually have docs for
> > how this is supposed to work:
> >
> > https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html#implicit-fence-poll-support
> >
> > The docs are pretty clear that we want both read and write for
> > EPOLLOUT, in other words both the exclusive and the shared fences. So
> > I guess back to your actual patch, but maybe we should get some acks
> > from userspace people for it (Michel, Pekka and Simon are probably the
> > usual suspects).
>
> Ok, sounds good to me. Going to send out a patch rebased to
> drm-misc-fixes today.
>
> >
> > The other thing I brought up and I haven't seen you reply to (maybe I
> > missed it) is whether we shouldn't just use dma_resv_get_fences(). We
> > need to do the refcounting anyway, and this avoids us having to
> > open-code this very nasty code. Finally, and imo most importantly, this
> > is what's also used in drm_gem_fence_array_add_implicit(), which many
> > drivers use to handle their implicit in-fences. So it would be nice to
> > have exactly matching code between that and what dma-buf poll does for
> > "can I read" and "can I write".
> >
> > Thoughts?
>
> Yeah, I've seen that. I just didn't have time to answer yet.
>
> That goes into the same direction as my thinking that we need to
> centralize the RCU and synchronization handling in general.
>
> What I don't like about the approach is that dma_resv_get_fences() needs
> to allocate memory. For a lot of use cases, including dma_buf_poll, that
> is rather overkill.
>
> To unify the handling I think we should use the iterator I've created a
> prototype of. This way we only have a "for_write" parameter and the
> iterator in return gives you all the fences you need.

Yeah, I think in general I agree - especially in the CS code a bunch of
temporary allocations aren't great. But dma_buf_poll isn't a place where
anyone cares about that. That's why I think we can just use
dma_resv_get_fences() here and forget about all the trickiness.

The gem helper otoh wants an iterator (without retry even, since it's
holding the dma-resv lock).

> When you then extend that approach, we could say we have an enum
> describing the use case. Something like:
> 1. For explicit sync, just give me all the must-sync fences from memory
> management.
> 2. For read, give me all the writers and memory management fences.
> 3. For write, give me all the readers, writers and memory management
> fences.
> 4. For memory management, give me everything, including things like PTE
> updates/TLB flushes.
>
> So instead of asking for some specific type of fence(s), the drivers
> tell the dma_resv object about their use case and in return get the
> fences they need to wait for.
>
> This essentially means that we move the decision what to wait for from
> the drivers into the dma_resv object, which I think would be a massive
> improvement.
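
Just to make sure we're talking about the same thing, here is roughly how
I read that proposal - every name below is made up on the spot, nothing of
this exists yet, it's only meant to illustrate the idea:

/*
 * Hypothetical sketch only: which fences a given caller needs to wait for,
 * following the four cases above.
 */
enum dma_resv_usage {
	DMA_RESV_USAGE_SYNC,	/* explicit sync: memory management fences only */
	DMA_RESV_USAGE_READ,	/* writers + memory management fences */
	DMA_RESV_USAGE_WRITE,	/* readers + writers + memory management fences */
	DMA_RESV_USAGE_MM,	/* everything, incl. PTE updates/TLB flushes */
};

/* Hypothetical sketch only: dma_resv hands back exactly the fences the
 * stated use case must wait for, the driver no longer decides itself. */
static long dma_resv_wait_usage(struct dma_resv *resv, enum dma_resv_usage usage)
{
	struct dma_resv_cursor cursor;
	struct dma_fence *fence;
	long ret = 0;

	dma_resv_for_each_fence(&cursor, resv, usage, fence) {
		ret = dma_fence_wait(fence, false);
		if (ret < 0)
			break;
	}
	return ret;
}

I.e. the driver only states its intent and the "what do I actually need to
wait for" policy lives entirely inside dma_resv.
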
>
> Functions like dma_resv_get_list(), dma_resv_get_excl(),
> dma_resv_get_excl_rcu() etc. would then essentially be deprecated, and
> instead you use dma_resv_get_fences(), dma_resv_for_each_fence(),
> dma_resv_wait_timeout() and dma_resv_test_signaled() with a proper use
> case.
>
> What do you think?

Yeah, I think in general the direction makes sense, at least long term.
For now I think it's better to go with the simplest solution first, until
we have everyone aligned on one set of rules and have those rules properly
documented.
-Daniel

> Christian.
>
> > -Daniel
> >
> >> Regards,
> >> Christian.
> >>
> >> On 22.06.21 at 15:04, Christian König wrote:
> >>> Daniel pointed me towards this function and there are multiple
> >>> obvious problems in the implementation.
> >>>
> >>> First of all the retry loop is not working as intended. In general
> >>> the retry only makes sense if you grab the reference first and then
> >>> check the sequence values.
> >>>
> >>> It's also good practice to keep the reference around when installing
> >>> callbacks to fences you don't own.
> >>>
> >>> And lastly, the whole implementation was unnecessarily complex and
> >>> rather hard to understand, which could lead to unexpected behavior
> >>> of the IOCTL.
> >>>
> >>> Fix all this by reworking the implementation from scratch.
> >>>
> >>> Only mildly tested and needs a thoughtful review of the code.
> >>>
> >>> v2: fix the reference counting as well
> >>> v3: keep the excl fence handling as is for stable
> >>>
> >>> Signed-off-by: Christian König <christian.koenig@xxxxxxx>
> >>> CC: stable@xxxxxxxxxxxxxxx
> >>> ---
> >>>  drivers/dma-buf/dma-buf.c | 133 ++++++++++++++++----------------------
> >>>  include/linux/dma-buf.h   |   2 +-
> >>>  2 files changed, 55 insertions(+), 80 deletions(-)
> >>>
> >>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> >>> index eadd1eaa2fb5..e97c3a9d98d5 100644
> >>> --- a/drivers/dma-buf/dma-buf.c
> >>> +++ b/drivers/dma-buf/dma-buf.c
> >>> @@ -72,7 +72,7 @@ static void dma_buf_release(struct dentry *dentry)
> >>>  	 * If you hit this BUG() it means someone dropped their ref to the
> >>>  	 * dma-buf while still having pending operation to the buffer.
> >>>  	 */
> >>> -	BUG_ON(dmabuf->cb_shared.active || dmabuf->cb_excl.active);
> >>> +	BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active);
> >>>
> >>>  	dmabuf->ops->release(dmabuf);
> >>>
> >>> @@ -202,16 +202,20 @@ static void dma_buf_poll_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
> >>>  	wake_up_locked_poll(dcb->poll, dcb->active);
> >>>  	dcb->active = 0;
> >>>  	spin_unlock_irqrestore(&dcb->poll->lock, flags);
> >>> +	dma_fence_put(fence);
> >>>  }
> >>>
> >>>  static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
> >>>  {
> >>> +	struct dma_buf_poll_cb_t *dcb;
> >>>  	struct dma_buf *dmabuf;
> >>>  	struct dma_resv *resv;
> >>>  	struct dma_resv_list *fobj;
> >>>  	struct dma_fence *fence_excl;
> >>> -	__poll_t events;
> >>>  	unsigned shared_count, seq;
> >>> +	struct dma_fence *fence;
> >>> +	__poll_t events;
> >>> +	int r, i;
> >>>
> >>>  	dmabuf = file->private_data;
> >>>  	if (!dmabuf || !dmabuf->resv)
> >>> @@ -225,99 +229,70 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
> >>>  	if (!events)
> >>>  		return 0;
> >>>
> >>> +	dcb = events & EPOLLOUT ?
> >>> +		&dmabuf->cb_out : &dmabuf->cb_in;
> >>> +
> >>> +	/* Only queue a new one if we are not still waiting for the old one */
> >>> +	spin_lock_irq(&dmabuf->poll.lock);
> >>> +	if (dcb->active)
> >>> +		events = 0;
> >>> +	else
> >>> +		dcb->active = events;
> >>> +	spin_unlock_irq(&dmabuf->poll.lock);
> >>> +	if (!events)
> >>> +		return 0;
> >>> +
> >>>  retry:
> >>>  	seq = read_seqcount_begin(&resv->seq);
> >>>  	rcu_read_lock();
> >>>
> >>>  	fobj = rcu_dereference(resv->fence);
> >>> -	if (fobj)
> >>> +	if (fobj && events & EPOLLOUT)
> >>>  		shared_count = fobj->shared_count;
> >>>  	else
> >>>  		shared_count = 0;
> >>> -	fence_excl = rcu_dereference(resv->fence_excl);
> >>> -	if (read_seqcount_retry(&resv->seq, seq)) {
> >>> -		rcu_read_unlock();
> >>> -		goto retry;
> >>> -	}
> >>>
> >>> -	if (fence_excl && (!(events & EPOLLOUT) || shared_count == 0)) {
> >>> -		struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_excl;
> >>> -		__poll_t pevents = EPOLLIN;
> >>> -
> >>> -		if (shared_count == 0)
> >>> -			pevents |= EPOLLOUT;
> >>> -
> >>> -		spin_lock_irq(&dmabuf->poll.lock);
> >>> -		if (dcb->active) {
> >>> -			dcb->active |= pevents;
> >>> -			events &= ~pevents;
> >>> -		} else
> >>> -			dcb->active = pevents;
> >>> -		spin_unlock_irq(&dmabuf->poll.lock);
> >>> -
> >>> -		if (events & pevents) {
> >>> -			if (!dma_fence_get_rcu(fence_excl)) {
> >>> -				/* force a recheck */
> >>> -				events &= ~pevents;
> >>> -				dma_buf_poll_cb(NULL, &dcb->cb);
> >>> -			} else if (!dma_fence_add_callback(fence_excl, &dcb->cb,
> >>> -							   dma_buf_poll_cb)) {
> >>> -				events &= ~pevents;
> >>> -				dma_fence_put(fence_excl);
> >>> -			} else {
> >>> -				/*
> >>> -				 * No callback queued, wake up any additional
> >>> -				 * waiters.
> >>> -				 */
> >>> -				dma_fence_put(fence_excl);
> >>> -				dma_buf_poll_cb(NULL, &dcb->cb);
> >>> -			}
> >>> +	for (i = 0; i < shared_count; ++i) {
> >>> +		fence = rcu_dereference(fobj->shared[i]);
> >>> +		fence = dma_fence_get_rcu(fence);
> >>> +		if (!fence || read_seqcount_retry(&resv->seq, seq)) {
> >>> +			/* Concurrent modify detected, force re-check */
> >>> +			dma_fence_put(fence);
> >>> +			rcu_read_unlock();
> >>> +			goto retry;
> >>>  		}
> >>> -	}
> >>>
> >>> -	if ((events & EPOLLOUT) && shared_count > 0) {
> >>> -		struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_shared;
> >>> -		int i;
> >>> -
> >>> -		/* Only queue a new callback if no event has fired yet */
> >>> -		spin_lock_irq(&dmabuf->poll.lock);
> >>> -		if (dcb->active)
> >>> -			events &= ~EPOLLOUT;
> >>> -		else
> >>> -			dcb->active = EPOLLOUT;
> >>> -		spin_unlock_irq(&dmabuf->poll.lock);
> >>> -
> >>> -		if (!(events & EPOLLOUT))
> >>> +		r = dma_fence_add_callback(fence, &dcb->cb, dma_buf_poll_cb);
> >>> +		if (!r) {
> >>> +			/* Callback queued */
> >>> +			events = 0;
> >>>  			goto out;
> >>> +		}
> >>> +		dma_fence_put(fence);
> >>> +	}
> >>>
> >>> -		for (i = 0; i < shared_count; ++i) {
> >>> -			struct dma_fence *fence = rcu_dereference(fobj->shared[i]);
> >>> -
> >>> -			if (!dma_fence_get_rcu(fence)) {
> >>> -				/*
> >>> -				 * fence refcount dropped to zero, this means
> >>> -				 * that fobj has been freed
> >>> -				 *
> >>> -				 * call dma_buf_poll_cb and force a recheck!
> >>> -				 */
> >>> -				events &= ~EPOLLOUT;
> >>> -				dma_buf_poll_cb(NULL, &dcb->cb);
> >>> -				break;
> >>> -			}
> >>> -			if (!dma_fence_add_callback(fence, &dcb->cb,
> >>> -						    dma_buf_poll_cb)) {
> >>> -				dma_fence_put(fence);
> >>> -				events &= ~EPOLLOUT;
> >>> -				break;
> >>> -			}
> >>> +	fence = dma_resv_get_excl(resv);
> >>> +	if (fence && shared_count == 0) {
> >>> +		fence = dma_fence_get_rcu(fence);
> >>> +		if (!fence || read_seqcount_retry(&resv->seq, seq)) {
> >>> +			/* Concurrent modify detected, force re-check */
> >>>  			dma_fence_put(fence);
> >>> +			rcu_read_unlock();
> >>> +			goto retry;
> >>> +
> >>>  		}
> >>>
> >>> -		/* No callback queued, wake up any additional waiters. */
> >>> -		if (i == shared_count)
> >>> -			dma_buf_poll_cb(NULL, &dcb->cb);
> >>> +		r = dma_fence_add_callback(fence, &dcb->cb, dma_buf_poll_cb);
> >>> +		if (!r) {
> >>> +			/* Callback queued */
> >>> +			events = 0;
> >>> +			goto out;
> >>> +		}
> >>> +		dma_fence_put(fence);
> >>>  	}
> >>>
> >>> +	/* No callback queued, wake up any additional waiters. */
> >>> +	dma_buf_poll_cb(NULL, &dcb->cb);
> >>> +
> >>>  out:
> >>>  	rcu_read_unlock();
> >>>  	return events;
> >>> @@ -562,8 +537,8 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
> >>>  	dmabuf->owner = exp_info->owner;
> >>>  	spin_lock_init(&dmabuf->name_lock);
> >>>  	init_waitqueue_head(&dmabuf->poll);
> >>> -	dmabuf->cb_excl.poll = dmabuf->cb_shared.poll = &dmabuf->poll;
> >>> -	dmabuf->cb_excl.active = dmabuf->cb_shared.active = 0;
> >>> +	dmabuf->cb_in.poll = dmabuf->cb_out.poll = &dmabuf->poll;
> >>> +	dmabuf->cb_in.active = dmabuf->cb_out.active = 0;
> >>>
> >>>  	if (!resv) {
> >>>  		resv = (struct dma_resv *)&dmabuf[1];
> >>> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> >>> index efdc56b9d95f..7e747ad54c81 100644
> >>> --- a/include/linux/dma-buf.h
> >>> +++ b/include/linux/dma-buf.h
> >>> @@ -329,7 +329,7 @@ struct dma_buf {
> >>>  		wait_queue_head_t *poll;
> >>>
> >>>  		__poll_t active;
> >>> -	} cb_excl, cb_shared;
> >>> +	} cb_in, cb_out;
> >>>  };
> >>>
> >>>  /**
> >
>
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
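
The userspace-visible side of the "can I read" / "can I write" semantics
discussed above is simply poll(2) on the dma-buf fd, as described in the
implicit fence poll documentation linked earlier. A minimal sketch, with
the helper name made up purely for illustration:

#include <poll.h>

/*
 * Illustration only: block until the dma-buf's implicit fences allow the
 * requested access. Per the documented semantics, POLLIN waits until the
 * buffer can be read (exclusive fence signaled) and POLLOUT waits until
 * it can be written (exclusive and shared fences signaled).
 */
static int dmabuf_wait_implicit(int dmabuf_fd, short events, int timeout_ms)
{
	struct pollfd pfd = {
		.fd = dmabuf_fd,
		.events = events,	/* POLLIN, POLLOUT or both */
	};
	int ret;

	ret = poll(&pfd, 1, timeout_ms);
	if (ret < 0)
		return -1;	/* error, check errno */
	return ret;		/* 0 on timeout, 1 once the access is safe */
}

The kernel-side rework above is what decides which fences those two cases
actually end up waiting on.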