On 03/20/2015 04:19 PM, Chris Wilson wrote:
On Fri, Mar 20, 2015 at 04:01:52PM +0000, Tvrtko Ursulin wrote:
On 03/20/2015 02:36 PM, Chris Wilson wrote:
This provides a nice boost to Mesa in swap-bound scenarios (as Mesa
throttles itself against the previous frame, which in that scenario will
complete shortly). It will also provide a good boost to systems running
with semaphores disabled, which therefore frequently wait on the GPU as
it switches rings. In the most favourable of microbenchmarks, this can
increase performance by around 15% - though in practice improvements
will be marginal and rarely noticeable.
v2: Account for user timeouts
v3: Limit the spinning to a single jiffie (~1us) at most. On an
otherwise idle system, there is no scheduler contention and so without a
limit we would spin until the GPU is ready.
v4: Drop forcewake - the lazy coherent access doesn't require it, and we
have no reason to believe that the forcewake itself improves seqno
coherency - it only adds delay.
Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Daniel Vetter <daniel.vetter@xxxxxxxx>
Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx>
Cc: Eero Tamminen <eero.t.tamminen@xxxxxxxxx>
Cc: "Rantala, Valtteri" <valtteri.rantala@xxxxxxxxx>
Still against a toggle switch like a simple module parameter?
Yes. I'd much rather tackle the corner cases than ignore them.
I'll say it once more then leave it be - my point of view is that a
module param does not mean ignoring any issues. It rather means that, if
a pathological use case is found in the field, you can provide a better
user experience immediately and work in parallel on coming up with
improvements.
Your view is probably that if there is a toggle, someone somewhere will
put "yeah, if that happens just add this to modprobe.conf" on some wiki
and there won't even be a bug report.
It is a downside, yes, but to me much better than finding a bad corner
case and then saying "yeah, please recompile your kernel", or "downgrade
your kernel and wait for X weeks/months until the fix propagates".
---
drivers/gpu/drm/i915/i915_gem.c | 44 +++++++++++++++++++++++++++++++++++------
1 file changed, 38 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2e17a254aac1..9988e65c1440 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1181,6 +1181,29 @@ static bool missed_irq(struct drm_i915_private *dev_priv,
 	return test_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings);
 }
+
+static int __i915_spin_request(struct drm_i915_gem_request *rq)
+{
+	unsigned long timeout;
+
+	if (i915_gem_request_get_ring(rq)->irq_refcount)
+		return -EBUSY;
So if someone else is already waiting on this ring, skip the spin and
do the sleep-wait?
That would mean the earlier waiter didn't manage to spin to completion,
so does it make sense for subsequent ones to try to spin? If we assume
waiters arrive here in submission order then no, they should proceed to
sleep-wait. But if waiters arrive here in random order - and that is
purely up to userspace, I think - then I am not sure.
They arrive pretty much in random order.
On the other hand, if we allowed these out-of-order waiters to spin,
that would potentially be too much spinning, so maybe it is better as
it is.
Also, part of my premise is that it is the cost of the irq (setting it
up and handling all the intermediate ones) that is the crux of the
issue. Once we have enabled the irq for one waiter, we may as well use
it for the herd. Also, with a herd we will want to err on the side of
sleeping more, or so my intuition says.
My gut feeling is the same. Waking a thundering herd sounds safer than
having it spin.
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx