On Mon, Jun 06, 2016 at 04:34:27PM +0100, Tvrtko Ursulin wrote: > > On 03/06/16 17:08, Chris Wilson wrote: > >If we flag the seqno as potentially stale upon receiving an interrupt, > >we can use that information to reduce the frequency that we apply the > >heavyweight coherent seqno read (i.e. if we wake up a chain of waiters). > > > >Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > >--- > > drivers/gpu/drm/i915/i915_drv.h | 15 ++++++++++++++- > > drivers/gpu/drm/i915/i915_irq.c | 1 + > > drivers/gpu/drm/i915/intel_breadcrumbs.c | 16 ++++++++++------ > > drivers/gpu/drm/i915/intel_ringbuffer.h | 1 + > > 4 files changed, 26 insertions(+), 7 deletions(-) > > > >diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > >index 4ddb9ff319cb..a71d08199d57 100644 > >--- a/drivers/gpu/drm/i915/i915_drv.h > >+++ b/drivers/gpu/drm/i915/i915_drv.h > >@@ -3935,7 +3935,20 @@ static inline bool __i915_request_irq_complete(struct drm_i915_gem_request *req) > > * but it is easier and safer to do it every time the waiter > > * is woken. > > */ > >- if (engine->irq_seqno_barrier) { > >+ if (engine->irq_seqno_barrier && READ_ONCE(engine->irq_posted)) { > >+ /* The ordering of irq_posted versus applying the barrier > >+ * is crucial. The clearing of the current irq_posted must > >+ * be visible before we perform the barrier operation, > >+ * such that if a subsequent interrupt arrives, irq_posted > >+ * is reasserted and our task rewoken (which causes us to > >+ * do another __i915_request_irq_complete() immediately > >+ * and reapply the barrier). Conversely, if the clear > >+ * occurs after the barrier, then an interrupt that arrived > >+ * whilst we waited on the barrier would not trigger a > >+ * barrier on the next pass, and the read may not see the > >+ * seqno update. > >+ */ > >+ WRITE_ONCE(engine->irq_posted, false); > > Why is this not smp_store_mb ? We only require the ordering wrt to irq_seqno_barrier(). How about: if (engine->irq_seqno_barrier && cmpxchg_relaxed(&engine->irq_post, 1, 0)) { Less shouty? > >diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c > >index 44346de39794..0f5fe114c204 100644 > >--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c > >+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c > >@@ -43,12 +43,18 @@ static void intel_breadcrumbs_fake_irq(unsigned long data) > > > > static void irq_enable(struct intel_engine_cs *engine) > > { > >+ /* Enabling the IRQ may miss the generation of the interrupt, but > >+ * we still need to force the barrier before reading the seqno, > >+ * just in case. > >+ */ > >+ engine->irq_posted = true; > > Should it be smp_store_mb here as well? No, this is written/read on the same callchain. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx