Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:

> Quoting Mika Kuoppala (2018-06-27 14:08:34)
>> Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:
>>
>> > By taking advantage of the RCU protection of the task struct, we can find
>> > the appropriate signaler under the spinlock and then release the spinlock
>> > before waking the task and signaling the fence.
>> >
>> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
>> > ---
>> >  drivers/gpu/drm/i915/i915_irq.c | 33 ++++++++++++++++++++++-----------
>> >  1 file changed, 22 insertions(+), 11 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
>> > index 316d0b08d40f..53dad48f92ce 100644
>> > --- a/drivers/gpu/drm/i915/i915_irq.c
>> > +++ b/drivers/gpu/drm/i915/i915_irq.c
>> > @@ -1145,21 +1145,23 @@ static void ironlake_rps_change_irq_handler(struct drm_i915_private *dev_priv)
>> >
>> >  static void notify_ring(struct intel_engine_cs *engine)
>> >  {
>> > +	const u32 seqno = intel_engine_get_seqno(engine);
>> >  	struct i915_request *rq = NULL;
>> > +	struct task_struct *tsk = NULL;
>> >  	struct intel_wait *wait;
>> >
>> > -	if (!engine->breadcrumbs.irq_armed)
>> > +	if (unlikely(!engine->breadcrumbs.irq_armed))
>> >  		return;
>> >
>>
>> Ok, so due to the unlikeliness, you get the seqno early.
>>
>> >  	atomic_inc(&engine->irq_count);
>> > -	set_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted);
>> > +
>> > +	rcu_read_lock();
>>
>> As I understand from the irc discussion, we have our own const
>> or stable copy of the task struct from now on.
>>
>> >  	spin_lock(&engine->breadcrumbs.irq_lock);
>> >  	wait = engine->breadcrumbs.irq_wait;
>> >  	if (wait) {
>> > -		bool wakeup = engine->irq_seqno_barrier;
>> > -
>> > -		/* We use a callback from the dma-fence to submit
>> > +		/*
>> > +		 * We use a callback from the dma-fence to submit
>> >  		 * requests after waiting on our own requests. To
>> >  		 * ensure minimum delay in queuing the next request to
>> >  		 * hardware, signal the fence now rather than wait for
>> > @@ -1170,19 +1172,23 @@ static void notify_ring(struct intel_engine_cs *engine)
>> >  		 * and to handle coalescing of multiple seqno updates
>> >  		 * and many waiters.
>> >  		 */
>> > -		if (i915_seqno_passed(intel_engine_get_seqno(engine),
>> > -				      wait->seqno)) {
>> > +		if (i915_seqno_passed(seqno, wait->seqno)) {
>> >  			struct i915_request *waiter = wait->request;
>> >
>> > -			wakeup = true;
>> >  			if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
>> >  				      &waiter->fence.flags) &&
>> >  			    intel_wait_check_request(wait, waiter))
>> >  				rq = i915_request_get(waiter);
>> > -		}
>> >
>> > -		if (wakeup)
>> > -			wake_up_process(wait->tsk);
>> > +			tsk = wait->tsk;
>> > +		} else {
>> > +			if (engine->irq_seqno_barrier &&
>> > +			    i915_seqno_passed(seqno, wait->seqno - 1)) {
>> > +				set_bit(ENGINE_IRQ_BREADCRUMB,
>> > +					&engine->irq_posted);
>> > +				tsk = wait->tsk;
>>
>> Hmm, you are optimistic that the latency of the wakeup will be on par
>> with, or greater than, the time to the next request completion?
>>
>> And the wait side notices too that we are close, and spins
>> instead of going back to sleep?
>
> Don't forget this is the missed breadcrumb mitigation for gen5-gen7, and
> only for it. The intent is to keep that away from the paths that do not
> need the extra delays.

Looks like a sensible optimization, but I would split it out of this
patch into its own, to keep the content in sync with the commit message.

-Mika
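
For readers skimming the thread, the locking change at the heart of the
patch boils down to the pattern below. This is a minimal sketch, not the
driver code: the structures are pared down from i915's breadcrumbs, the
gen5-gen7 early-wake branch and the fence signaling are omitted, and
notify_waiter() is a made-up name. The point it illustrates is that
task_struct is freed via RCU, so a pointer fetched under the spinlock
remains safe to pass to wake_up_process() after the lock is dropped, as
long as we stay inside the RCU read-side critical section.

#include <linux/rcupdate.h>
#include <linux/sched.h>
#include <linux/spinlock.h>

struct wait_entry {
	struct task_struct *tsk;
};

struct breadcrumbs {
	spinlock_t irq_lock;
	struct wait_entry *irq_wait;
};

static void notify_waiter(struct breadcrumbs *b)
{
	struct task_struct *tsk = NULL;

	rcu_read_lock(); /* pins any task_struct we observe below */

	spin_lock(&b->irq_lock);
	if (b->irq_wait)
		tsk = b->irq_wait->tsk;
	spin_unlock(&b->irq_lock);

	/*
	 * The waiter may have given up and exited by now, but RCU
	 * guarantees its task_struct has not been freed, so waking it
	 * outside the lock is safe; at worst it is a spurious wakeup.
	 */
	if (tsk)
		wake_up_process(tsk);

	rcu_read_unlock();
}

This keeps the wakeup, and the potentially expensive fence signaling
that follows it, off the irq_lock hold time, which is the latency win
the commit message is after.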
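
On the wait->seqno - 1 question: the reason subtracting one is safe is
i915's wraparound-aware seqno comparison. If I am reading the tree of
this vintage correctly, the helper is the usual signed-difference idiom
(it has moved between headers over time, so treat this as a sketch):

static inline bool i915_seqno_passed(u32 seq1, u32 seq2)
{
	return (s32)(seq1 - seq2) >= 0;
}

Worked example at the wrap boundary: with the current seqno at
0xffffffff and wait->seqno == 0, the completion test
i915_seqno_passed(0xffffffff, 0) computes (s32)0xffffffff == -1 and so
is false, while the early-wake test compares against wait->seqno - 1,
which wraps to 0xffffffff, and (s32)(0xffffffff - 0xffffffff) == 0, so
it fires. The "one breadcrumb early" wakeup therefore behaves the same
on both sides of u32 wraparound.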