Re: [PATCH 17/21] drm/i915: Convert trace-irq to the breadcrumb waiter

Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> · Wed, 8 Jun 2016 12:24:42 +0100

On Wed, Jun 08, 2016 at 11:16:13AM +0100, Tvrtko Ursulin wrote:
> 
> On 08/06/16 10:48, Chris Wilson wrote:
> >On Tue, Jun 07, 2016 at 01:04:22PM +0100, Tvrtko Ursulin wrote:
> >>>+static int intel_breadcrumbs_signaler(void *arg)
> >>>+{
> >>>+	struct intel_engine_cs *engine = arg;
> >>>+	struct intel_breadcrumbs *b = &engine->breadcrumbs;
> >>>+	struct signal *signal;
> >>>+
> >>>+	/* Install ourselves with high priority to reduce signalling latency */
> >>>+	signaler_set_rtpriority();
> >>>+
> >>>+	do {
> >>>+		set_current_state(TASK_INTERRUPTIBLE);
> >>>+
> >>>+		/* We are either woken up by the interrupt bottom-half,
> >>>+		 * or by a client adding a new signaller. In both cases,
> >>>+		 * the GPU seqno may have advanced beyond our oldest signal.
> >>>+		 * If it has, propagate the signal, remove the waiter and
> >>>+		 * check again with the next oldest signal. Otherwise we
> >>>+		 * need to wait for a new interrupt from the GPU or for
> >>>+		 * a new client.
> >>>+		 */
> >>>+		signal = READ_ONCE(b->first_signal);
> >>>+		if (signal_complete(signal)) {
> >>>+			/* Wake up all other completed waiters and select the
> >>>+			 * next bottom-half for the next user interrupt.
> >>>+			 */
> >>>+			intel_engine_remove_wait(engine, &signal->wait);
> >>>+
> >>>+			i915_gem_request_unreference(signal->request);
> >>>+
> >>>+			/* Find the next oldest signal. Note that as we have
> >>>+			 * not been holding the lock, another client may
> >>>+			 * have installed an even older signal than the one
> >>>+			 * we just completed - so double check we are still
> >>>+			 * the oldest before picking the next one.
> >>>+			 */
> >>>+			spin_lock(&b->lock);
> >>>+			if (signal == b->first_signal)
> >>>+				b->first_signal = rb_next(&signal->node);
> >>>+			rb_erase(&signal->node, &b->signals);
> >>>+			spin_unlock(&b->lock);
> >>>+
> >>>+			kfree(signal);
> >>>+		} else {
> >>>+			if (kthread_should_stop())
> >>>+				break;
> >>>+
> >>>+			schedule();
> >>>+		}
> >>>+	} while (1);
> >>>+
> >>>+	return 0;
> >>>+}
> >>
> >>So the thread is only because it is convenient to plug it in the
> >>breadcrumbs infrastructure. Otherwise the processing above could be
> >>done from a lighter weight context as well since nothing seems to
> >>need the process context.
> >
> >No, seqno processing requires process/sleepable context. The delays we
> >incur can be >100us and not suitable for irq/softirq context.
> 
> Nothing in this patch needs it - please say in the commit why it is
> choosing the process context then.

Bottom half processing requires it. irq_seqno_barrier is not suitable
for irq/softirq context.

> And why so long delays? It looks pretty lightweight to me.
> 
> >>One alternative could perhaps be to add a waiter->wake_up vfunc and
> >>signalers could then potentially use a tasklet?
> >
> >Hmm, I did find that in order to reduce execlists latency, I had to
> >drive the tasklet processing from the signaler.
> 
> What do you mean? The existing execlists tasklet? Now would that work?

Due to how dma-fence signals, the softirq is never kicked
(spin_lock_irq doesn't handle local_bh_enable()) and so we would only
submit a new task via execlists on a reschedule. That latency added
about 30% (30s on bsw) to gem_exec_parallel.

> >>>+int intel_engine_enable_signaling(struct drm_i915_gem_request *request)
> >>>+{
> >>>+	struct intel_engine_cs *engine = request->engine;
> >>>+	struct intel_breadcrumbs *b = &engine->breadcrumbs;
> >>>+	struct rb_node *parent, **p;
> >>>+	struct signal *signal;
> >>>+	bool first, wakeup;
> >>>+
> >>>+	if (unlikely(IS_ERR(b->signaler)))
> >>>+		return PTR_ERR(b->signaler);
> >>
> >>I don't see that there is a fallback is kthread creation failed. It
> >>should just fail in intel_engine_init_breadcrumbs if that happens.
> >
> >Because it is not fatal to using the GPU, just one optional function.
> 
> But we never expect it to fail and it is not even dependent on
> anything user controllable. Just a random error which would cause
> user experience to degrade. If thread creation failed it means
> system is in such a poor shape I would just fail the driver init.

A minimally functional system is better than nothing at all.
GEM is not required for driver loading, interrupt driven dma-fences less
so.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx