On Tue, Apr 07, 2015 at 04:20:41PM +0100, Chris Wilson wrote: > This provides a nice boost to mesa in swap bound scenarios (as mesa > throttles itself to the previous frame and given the scenario that will > complete shortly). It will also provide a good boost to systems running > with semaphores disabled and so frequently waiting on the GPU as it > switches rings. In the most favourable of microbenchmarks, this can > increase performance by around 15% - though in practice improvements > will be marginal and rarely noticeable. > > v2: Account for user timeouts > v3: Limit the spinning to a single jiffie (~1us) at most. On an > otherwise idle system, there is no scheduler contention and so without a > limit we would spin until the GPU is ready. > v4: Drop forcewake - the lazy coherent access doesn't require it, and we > have no reason to believe that the forcewake itself improves seqno > coherency - it only adds delay. > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > Cc: Daniel Vetter <daniel.vetter@xxxxxxxx> > Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> > Cc: Eero Tamminen <eero.t.tamminen@xxxxxxxxx> > Cc: "Rantala, Valtteri" <valtteri.rantala@xxxxxxxxx> Eero/Valtterri, do you have perf data for this one? Thanks, Daniel > --- > drivers/gpu/drm/i915/i915_gem.c | 44 +++++++++++++++++++++++++++++++++++------ > 1 file changed, 38 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index c7d9ee2f708a..47650327204e 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -1181,6 +1181,29 @@ static bool missed_irq(struct drm_i915_private *dev_priv, > return test_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings); > } > > +static int __i915_spin_request(struct drm_i915_gem_request *rq) > +{ > + unsigned long timeout; > + > + if (i915_gem_request_get_ring(rq)->irq_refcount) > + return -EBUSY; > + > + timeout = jiffies + 1; > + while (!need_resched()) { > + if (i915_gem_request_completed(rq, true)) > + return 0; > + > + if (time_after_eq(jiffies, timeout)) > + break; > + > + cpu_relax_lowlatency(); > + } > + if (i915_gem_request_completed(rq, false)) > + return 0; > + > + return -EAGAIN; > +} > + > /** > * __i915_wait_request - wait until execution of request has finished > * @req: duh! > @@ -1225,12 +1248,20 @@ int __i915_wait_request(struct drm_i915_gem_request *req, > if (INTEL_INFO(dev)->gen >= 6) > gen6_rps_boost(dev_priv, file_priv); > > - if (!irq_test_in_progress && WARN_ON(!ring->irq_get(ring))) > - return -ENODEV; > - > /* Record current time in case interrupted by signal, or wedged */ > trace_i915_gem_request_wait_begin(req); > before = ktime_get_raw_ns(); > + > + /* Optimistic spin for the next jiffie before touching IRQs */ > + ret = __i915_spin_request(req); > + if (ret == 0) > + goto out; > + > + if (!irq_test_in_progress && WARN_ON(!ring->irq_get(ring))) { > + ret = -ENODEV; > + goto out; > + } > + > for (;;) { > struct timer_list timer; > > @@ -1279,14 +1310,15 @@ int __i915_wait_request(struct drm_i915_gem_request *req, > destroy_timer_on_stack(&timer); > } > } > - now = ktime_get_raw_ns(); > - trace_i915_gem_request_wait_end(req); > - > if (!irq_test_in_progress) > ring->irq_put(ring); > > finish_wait(&ring->irq_queue, &wait); > > +out: > + now = ktime_get_raw_ns(); > + trace_i915_gem_request_wait_end(req); > + > if (timeout) { > s64 tres = *timeout - (now - before); > > -- > 2.1.4 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx