Re: [Linaro-mm-sig] [PATCH] drm/msm: Add fence->wait() op

Rob Clark <robdclark@xxxxxxxxx> · Thu, 22 Jul 2021 08:46:41 -0700

On Thu, Jul 22, 2021 at 2:28 AM Christian König
<ckoenig.leichtzumerken@xxxxxxxxx> wrote:
>
> Am 22.07.21 um 11:08 schrieb Daniel Vetter:
> > [SNIP]
> >> As far as I know wake_up_state() tries to run the thread on the CPU it was
> >> scheduled last, while wait_event_* makes the thread run on the CPU who
> >> issues the wake by default.
> >>
> >> And yes I've also noticed this already and it was one of the reason why I
> >> suggested to use a wait_queue instead of the hand wired dma_fence_wait
> >> implementation.
> > The first versions had used wait_queue, but iirc we had some issues with
> > the callbacks and stuff and that was the reasons for hand-rolling. Or
> > maybe it was the integration of the lockless fastpath for
> > dma_fence_is_signalled().
> >
> >> [SNIP]
> >> Well it would have been nicer if we used the existing infrastructure instead
> >> of re-inventing stuff for dma_fence, but that chance is long gone.
> >>
> >> And you don't need a dma_fence_context base class, but rather just a flag in
> >> the dma_fence_ops if you want to change the behavior.
> > If there's something broken we should just fix it, not force everyone to
> > set a random flag. dma_fence work like special wait_queues, so if we
> > differ then we should go back to that.
>
> Wait a second with that, this is not broken. It's just different
> behavior and there are good arguments for both sides.
>
> If a wait is short you can have situations where you want to start the
> thread on the original CPU.
>      This is because you can assume that the caches on that CPU are
> still hot and heating up the caches on the local CPU would take longer
> than an inter CPU interrupt.
>
> But if the wait is long it makes more sense to run the thread on the CPU
> where you noticed the wake up event.
>      This is because you can assume that the caches are cold anyway and
> starting the thread on the current CPU (most likely from an interrupt
> handler) gives you the absolutely best latency.
>      In other words you usually return from the interrupt handler and
> just directly switch to the now running thread.
>
> I'm not sure if all drivers want the same behavior. Rob here seems to
> prefer number 2, but we have used 1 for dma_fence for a rather long time
> now and it could be that some people start to complain when we switch
> unconditionally.
>

Hmm, I wonder if it would make sense to have a dma_wait_fence() flag
to control the behavior, since it is maybe more about the waiter (and
perhaps how long the waiter expects to wait) than the signaler..

BR,
-R