Re: [PATCH 17/70] drm/i915: Optimistically spin for the request completion

Daniel Vetter <daniel@xxxxxxxx> · Wed, 8 Apr 2015 13:39:44 +0200

On Tue, Apr 07, 2015 at 04:20:41PM +0100, Chris Wilson wrote:
> This provides a nice boost to mesa in swap bound scenarios (as mesa
> throttles itself to the previous frame and given the scenario that will
> complete shortly). It will also provide a good boost to systems running
> with semaphores disabled and so frequently waiting on the GPU as it
> switches rings. In the most favourable of microbenchmarks, this can
> increase performance by around 15% - though in practice improvements
> will be marginal and rarely noticeable.
> 
> v2: Account for user timeouts
> v3: Limit the spinning to a single jiffie (~1us) at most. On an
> otherwise idle system, there is no scheduler contention and so without a
> limit we would spin until the GPU is ready.
> v4: Drop forcewake - the lazy coherent access doesn't require it, and we
> have no reason to believe that the forcewake itself improves seqno
> coherency - it only adds delay.
> 
> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Cc: Daniel Vetter <daniel.vetter@xxxxxxxx>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx>
> Cc: Eero Tamminen <eero.t.tamminen@xxxxxxxxx>
> Cc: "Rantala, Valtteri" <valtteri.rantala@xxxxxxxxx>

Eero/Valtterri, do you have perf data for this one?

Thanks, Daniel

> ---
>  drivers/gpu/drm/i915/i915_gem.c | 44 +++++++++++++++++++++++++++++++++++------
>  1 file changed, 38 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index c7d9ee2f708a..47650327204e 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1181,6 +1181,29 @@ static bool missed_irq(struct drm_i915_private *dev_priv,
>  	return test_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings);
>  }
>  
> +static int __i915_spin_request(struct drm_i915_gem_request *rq)
> +{
> +	unsigned long timeout;
> +
> +	if (i915_gem_request_get_ring(rq)->irq_refcount)
> +		return -EBUSY;
> +
> +	timeout = jiffies + 1;
> +	while (!need_resched()) {
> +		if (i915_gem_request_completed(rq, true))
> +			return 0;
> +
> +		if (time_after_eq(jiffies, timeout))
> +			break;
> +
> +		cpu_relax_lowlatency();
> +	}
> +	if (i915_gem_request_completed(rq, false))
> +		return 0;
> +
> +	return -EAGAIN;
> +}
> +
>  /**
>   * __i915_wait_request - wait until execution of request has finished
>   * @req: duh!
> @@ -1225,12 +1248,20 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  	if (INTEL_INFO(dev)->gen >= 6)
>  		gen6_rps_boost(dev_priv, file_priv);
>  
> -	if (!irq_test_in_progress && WARN_ON(!ring->irq_get(ring)))
> -		return -ENODEV;
> -
>  	/* Record current time in case interrupted by signal, or wedged */
>  	trace_i915_gem_request_wait_begin(req);
>  	before = ktime_get_raw_ns();
> +
> +	/* Optimistic spin for the next jiffie before touching IRQs */
> +	ret = __i915_spin_request(req);
> +	if (ret == 0)
> +		goto out;
> +
> +	if (!irq_test_in_progress && WARN_ON(!ring->irq_get(ring))) {
> +		ret = -ENODEV;
> +		goto out;
> +	}
> +
>  	for (;;) {
>  		struct timer_list timer;
>  
> @@ -1279,14 +1310,15 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  			destroy_timer_on_stack(&timer);
>  		}
>  	}
> -	now = ktime_get_raw_ns();
> -	trace_i915_gem_request_wait_end(req);
> -
>  	if (!irq_test_in_progress)
>  		ring->irq_put(ring);
>  
>  	finish_wait(&ring->irq_queue, &wait);
>  
> +out:
> +	now = ktime_get_raw_ns();
> +	trace_i915_gem_request_wait_end(req);
> +
>  	if (timeout) {
>  		s64 tres = *timeout - (now - before);
>  
> -- 
> 2.1.4
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx