Re: [PATCH v2] drm/syncobj: ensure progress for syncobj queries

On Wed, 6 Nov 2024 13:44:20 +0100
Christian König <christian.koenig@xxxxxxx> wrote:

> Am 06.11.24 um 09:14 schrieb Boris Brezillon:
> > On Tue, 5 Nov 2024 09:56:22 -0800
> > Chia-I Wu<olvaffe@xxxxxxxxx>  wrote:
> >  
> >> On Mon, Nov 4, 2024 at 11:32 PM Christian König
> >> <christian.koenig@xxxxxxx>  wrote:  
> >>> Am 04.11.24 um 22:32 schrieb Chia-I Wu:
> >>>
> >>> On Tue, Oct 22, 2024 at 10:24 AM Chia-I Wu<olvaffe@xxxxxxxxx>  wrote:
> >>>
> >>> On Tue, Oct 22, 2024 at 9:53 AM Christian König
> >>> <christian.koenig@xxxxxxx>  wrote:
> >>>
> >>> Am 22.10.24 um 18:18 schrieb Chia-I Wu:
> >>>
> >>> Userspace might poll a syncobj with the query ioctl.  Call
> >>> dma_fence_enable_sw_signaling to ensure dma_fence_is_signaled returns
> >>> true in finite time.
> >>>
> >>> Wait a second, just querying the fence status is absolutely not
> >>> guaranteed to return true in finite time. That is well documented on the
> >>> dma_fence object.
> >>>
> >>> When you want to poll on signaling from userspace you really need to
> >>> call poll or the wait IOCTL with a zero timeout. That will also return
> >>> immediately but should enable signaling while doing that.
> >>>
> >>> So just querying the status should absolutely *not* enable signaling.
> >>> That's an intentional separation.
> >>>
> >>> I think it depends on what semantics DRM_IOCTL_SYNCOBJ_QUERY should have.
> >>>
> >>>
> >>> Well that's what I pointed out. The behavior of the QUERY IOCTL is based on the behavior of the dma_fence, and the latter is documented to do exactly what it currently does.
> >>>
> >>> If DRM_IOCTL_SYNCOBJ_QUERY is mainly for vulkan timeline semaphores,
> >>> it is a bit heavy if userspace has to do a
> >>> DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT before a query.
> >>>
> >>>
> >>> Maybe you misunderstood me, you *only* have to call DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT and *not* _QUERY.
> >>>
> >>> The underlying dma_fence_wait_timeout() function is extra optimized so that zero timeout has only minimal overhead.
> >>>
> >>> This overhead is actually lower than _QUERY because that one actually queries the driver for the current status while _WAIT just assumes that the driver will signal the fence when ready from an interrupt.  
> >> The context here is that vkGetSemaphoreCounterValue calls QUERY to get
> >> the timeline value.  WAIT does not replace QUERY.  
> 
> Oh, that is a really good argument.
> 
> >> Taking a step back, in the binary (signaled/unsignaled) case, a WAIT
> >> with zero timeout can get the up-to-date status.  But in the timeline
> >> case, there is no direct way to get the up-to-date status if QUERY
> >> must strictly be a wrapper for dma_fence_is_signaled.  It comes back
> >> to what was QUERY designed for and can we change it?  
> 
> Yeah that is a really good point. If _QUERY implements different 
> functionality than _WAIT, then the usual distinction between polling and 
> interrupt-based approaches can't be applied here.
> 
> >>> I filed a Mesa issue,
> >>> https://gitlab.freedesktop.org/mesa/mesa/-/issues/12094, and Faith
> >>> suggested a kernel-side fix as well.  Should we reconsider this?
> >>>
> >>>
> >>> Wait a second, you might have an even bigger misconception here. The difference between waiting and querying is usually intentional!
> >>>
> >>> This is done so that for example on mobile devices you don't need to enable device interrupts, but rather query in defined intervals.
> >>>
> >>> This is a very common design pattern and while I don't know the wording of the Vulkan timeline extension it's quite likely that this is the intended use case.  
> >> Yeah, there are Vulkan CTS tests that query timeline semaphores
> >> repeatedly for progress.  Those tests can fail because mesa translates
> >> the queries directly to the QUERY ioctl.
> >>
> >> As things are, enable_signaling is a requirement to query up-to-date
> >> status no matter whether the syncobj is binary or a timeline.  
> > I kinda agree with Chia-I here. What's the point of querying a timeline
> > syncobj if what we get in return is an outdated sync point? I get that
> > the overhead of enabling signalling exists, but if we stick to this
> > position, that means the QUERY ioctl is not suitable for
> > vkGetSemaphoreCounterValue() unless we first add a
> > WAIT(sync_point=0,timeout=0) to make sure signalling is enabled on all
> > fences contained by the dma_fence_chain backing the timeline syncobj.
> > And this has to be done for each vkGetSemaphoreCounterValue(), because
> > new sync points don't have signalling enabled by default.  
> 
> The common driver design I know from other operating systems is that you 
> either poll without enabling interrupts or you enable interrupts and 
> wait for the async operation to complete.

The problem is that we don't really poll if we call ::signaled() on a
dma_fence_array. The test is based on the state the container was at
creation time. The only way to poll a fence_array right now is to
enable signalling. So maybe that's the problem we should solve here:
make dma_fence_array_signaled() iterate over the fences instead of
checking ::num_pending and returning false if it's not zero (see the
diff at the end of this email).
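
To make that concrete, here's a tiny user-space mock (illustration only,
all names invented; not kernel code) of the two strategies: a container
that trusts a cached pending counter stays stale until the interrupt
path decrementing it is enabled, while iterating and polling each
fence's ->signaled() callback reflects the current state:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock fence: ->signaled() plays the role of dma_fence_ops::signaled,
 * hw_done stands in for the hardware state it would poll. */
struct mock_fence {
	bool hw_done;
	bool (*signaled)(struct mock_fence *f);
};

static bool mock_signaled(struct mock_fence *f)
{
	return f->hw_done; /* poll the "hardware" */
}

struct mock_array {
	struct mock_fence *fences;
	size_t count;
	size_t num_pending; /* only decremented by (mock) interrupt callbacks */
};

/* Current behavior: trust the cached counter, which nothing updates
 * while signaling is disabled. */
static bool array_signaled_cached(struct mock_array *a)
{
	return a->num_pending == 0;
}

/* Proposed behavior: fall back to polling each fence. */
static bool array_signaled_polling(struct mock_array *a)
{
	for (size_t i = 0; i < a->count; i++)
		if (!a->fences[i].signaled(&a->fences[i]))
			return false;
	return true;
}
```

With both mock fences done in "hardware" but num_pending never
decremented (no interrupt path enabled), the cached check still reports
unsignaled while the polling check reports signaled — which is the gap
the diff at the end of this email is aimed at.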

> 
> That distinction is really important for things like mobile devices 
> where interrupts are rather power costly.

Sure, I get the importance of keeping interrupts disabled for
power-savings.

> 
> > At the very least, we should add a new DRM_SYNCOBJ_QUERY_FLAGS_ flag
> > (DRM_SYNCOBJ_QUERY_FLAGS_REFRESH_STATE?) to combine the
> > enable_signalling and query operations in a single ioctl. If we go
> > for this approach, that means mesa has to support both cases, and pick
> > the most optimal one if the kernel supports it.  
> 
> Another problem we have here is that dma_fence_enable_signaling() can 
> mean two different things, maybe the documentation is a bit confusing:
> 
> 1) Enabling interrupts so that we don't need to call the 
> ops->is_signaled() driver callback.
> 
> 2) Asking preemption fences to actually signal because memory management 
> wants to do something.

Uh, I wasn't aware of (2). If it's documented somewhere, I probably missed
that part.

> 
> So when this ops->is_signaled() callback is implemented it shouldn't be 
> necessary to enable signaling to query the state.

Agree on that, see my comment about fence_array() not necessarily doing
the right thing here.

> 
> To summarize: When you call _QUERY you shouldn't get an outdated sync 
> point. If you do, then there is something wrong with the backend 
> driver.

If by backend, you mean the dma_fence_array implementation, you're probably
right.

> 
> Regards,
> Christian.
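
(For reference, the userspace-side fallback discussed above — a
zero-timeout TIMELINE_WAIT on point 0 to get signaling enabled, followed
by the actual QUERY — might look roughly like the sketch below. Error
handling is stripped, and the helper names are made up; whether the
extra wait should be necessary at all is exactly what's being debated
in this thread.)

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <drm/drm.h>

/* Prepare a zero-timeout wait on point 0 of a single syncobj. The wait
 * itself returns immediately, but it forces signaling to be enabled on
 * the fences backing the timeline. */
static void prep_zero_wait(struct drm_syncobj_timeline_wait *wait,
			   const uint32_t *handle, const uint64_t *point)
{
	memset(wait, 0, sizeof(*wait));
	wait->handles = (uintptr_t)handle;
	wait->points = (uintptr_t)point;
	wait->timeout_nsec = 0;
	wait->count_handles = 1;
}

/* QUERY preceded by the zero-timeout wait workaround. */
static int query_with_refresh(int drm_fd, uint32_t handle, uint64_t *value)
{
	struct drm_syncobj_timeline_wait wait;
	struct drm_syncobj_timeline_array query;
	uint64_t zero = 0;

	prep_zero_wait(&wait, &handle, &zero);
	/* Expected to return immediately (possibly with -ETIME); we only
	 * care about the enable-signaling side effect. */
	ioctl(drm_fd, DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT, &wait);

	memset(&query, 0, sizeof(query));
	query.handles = (uintptr_t)&handle;
	query.points = (uintptr_t)value;
	query.count_handles = 1;
	return ioctl(drm_fd, DRM_IOCTL_SYNCOBJ_QUERY, &query);
}
```

Note the cost: this has to be done before every query, since new sync
points added to the chain don't have signaling enabled by default.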


--->8---
diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-buf/dma-fence-array.c
index 8a08ffde31e7..c9222a065954 100644
--- a/drivers/dma-buf/dma-fence-array.c
+++ b/drivers/dma-buf/dma-fence-array.c
@@ -104,8 +104,22 @@ static bool dma_fence_array_signaled(struct dma_fence *fence)
 {
        struct dma_fence_array *array = to_dma_fence_array(fence);
 
-       if (atomic_read(&array->num_pending) > 0)
-               return false;
+       if (atomic_read(&array->num_pending) > 0) {
+               struct dma_fence *subfence;
+               unsigned i;
+
+               /*
+                * If signaling is enabled, we don't need to iterate over
+                * fences to get the state, we can rely on num_pending > 0.
+                */
+               if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags))
+                       return false;
+
+               dma_fence_array_for_each(subfence, i, fence) {
+                       if (!dma_fence_is_signaled(subfence))
+                               return false;
+               }
+       }
 
        dma_fence_array_clear_pending_error(array);
        return true;





