On Fri, Feb 19, 2016 at 12:08:14PM +0000, Tvrtko Ursulin wrote:
> 
> Hi,
> 
> On 11/01/16 10:44, Chris Wilson wrote:
> >[ 196.988204] clocksource: timekeeping watchdog: Marking clocksource 'tsc' as unstable because the skew is too large:
> >[ 196.988512] clocksource: 'refined-jiffies' wd_now: ffff9b48 wd_last: ffff9acb mask: ffffffff
> >[ 196.988559] clocksource: 'tsc' cs_now: 4fcfa84354 cs_last: 4f95425e98 mask: ffffffffffffffff
> >[ 196.992115] clocksource: Switched to clocksource refined-jiffies
> >
> >Followed by a hard lockup.
> 
> What does the above mean? Irq handler taking too long interferes
> with time keeping?

That's exactly what it means: we run for too long in interrupt context
(i.e. with interrupts disabled).

> I like it BTW. Just the commit message needs more work. :)
> 
> How is the performance impact with just this patch in isolation?
> Worth progressing it on its own?

I only looked for regressions, which I didn't find. It fixes a machine
freeze/panic, so I wasn't looking for any other reason to justify the
patch!

> >-void intel_lrc_irq_handler(struct intel_engine_cs *ring)
> >+static int intel_execlists_submit(void *arg)
> > {
> >-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> >-	u32 status_pointer;
> >-	u8 read_pointer;
> >-	u8 write_pointer;
> >-	u32 status = 0;
> >-	u32 status_id;
> >-	u32 submit_contexts = 0;
> >+	struct intel_engine_cs *ring = arg;
> >+	struct drm_i915_private *dev_priv = ring->i915;
> >
> >-	status_pointer = I915_READ(RING_CONTEXT_STATUS_PTR(ring));
> >+	set_rtpriority();
> >
> >-	read_pointer = ring->next_context_status_buffer;
> >-	write_pointer = GEN8_CSB_WRITE_PTR(status_pointer);
> >-	if (read_pointer > write_pointer)
> >-		write_pointer += GEN8_CSB_ENTRIES;
> >+	do {
> >+		u32 status;
> >+		u32 status_id;
> >+		u32 submit_contexts;
> >+		u8 head, tail;
> >
> >-	spin_lock(&ring->execlist_lock);
> >+		set_current_state(TASK_INTERRUPTIBLE);
> 
> Hm, what is the effect of setting TASK_INTERRUPTIBLE at this stage
> rather than just before the call to schedule?

We need to set the task state before doing the checks, so that we do
not miss a wakeup from the interrupt handler. Otherwise we may check
the register, decide there is nothing to do, be interrupted by the irq
handler which then sets us to TASK_RUNNING, and then proceed with going
to sleep by setting TASK_INTERRUPTIBLE (thereby missing that the
context-switch had just occurred and sleeping forever).

> And why interruptible?

To hide ourselves from contributing to the system load and appear as
sleeping (rather than blocked) in the process lists.

> >+		while (head++ < tail) {
> >+			status = I915_READ(RING_CONTEXT_STATUS_BUF_LO(ring, head % GEN8_CSB_ENTRIES));
> >+			status_id = I915_READ(RING_CONTEXT_STATUS_BUF_HI(ring, head % GEN8_CSB_ENTRIES));
> 
> One potentially cheap improvement I've been thinking of is to move
> the CSB reading outside the execlist_lock, to avoid slow MMIO
> contending the lock.

Yes, that should be a small but noticeable improvement.

> We could fetch all valid entries into a temporary local buffer, then
> grab the execlist_lock and process completions and submission from
> there.

If you look at the next patch, you will see that's what I did do :)
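Putting the two together, setting the task state before the checks and
keeping the slow CSB reads outside the lock, the thread loop ends up
looking roughly like this. Note this is an illustrative sketch rather
than the actual patch (set_rtpriority() and the rest of the plumbing
are omitted), and process_csb_entry() is a hypothetical stand-in for
the real completion handling and ELSP submission:

static int intel_execlists_submit(void *arg)
{
	struct intel_engine_cs *ring = arg;
	struct drm_i915_private *dev_priv = ring->i915;

	do {
		u32 csb[GEN8_CSB_ENTRIES][2];
		u8 head, tail;
		int i, count;

		/* Set the task state _before_ sampling the write pointer:
		 * if the irq handler fires in between and wakes us, we are
		 * put back to TASK_RUNNING and the schedule() below returns
		 * immediately instead of sleeping on a missed wakeup.
		 */
		set_current_state(TASK_INTERRUPTIBLE);

		head = ring->next_context_status_buffer;
		tail = GEN8_CSB_WRITE_PTR(I915_READ(RING_CONTEXT_STATUS_PTR(ring)));
		if (head == tail) {
			schedule();
			continue;
		}
		__set_current_state(TASK_RUNNING);

		if (head > tail)
			tail += GEN8_CSB_ENTRIES;

		/* The slow MMIO reads of the CSB entries go into a local
		 * buffer first...
		 */
		count = 0;
		while (head++ < tail) {
			csb[count][0] = I915_READ(RING_CONTEXT_STATUS_BUF_LO(ring, head % GEN8_CSB_ENTRIES));
			csb[count][1] = I915_READ(RING_CONTEXT_STATUS_BUF_HI(ring, head % GEN8_CSB_ENTRIES));
			count++;
		}

		/* ...so that the execlist_lock is only held whilst
		 * processing completions and submitting the next contexts,
		 * not for the MMIO itself. process_csb_entry() here is a
		 * placeholder for that work.
		 */
		spin_lock(&ring->execlist_lock);
		for (i = 0; i < count; i++)
			process_csb_entry(ring, csb[i][0], csb[i][1]);
		spin_unlock(&ring->execlist_lock);

		ring->next_context_status_buffer = tail % GEN8_CSB_ENTRIES;
		I915_WRITE(RING_CONTEXT_STATUS_PTR(ring),
			   _MASKED_FIELD(GEN8_CSB_PTR_MASK << 8,
					 ring->next_context_status_buffer << 8));
	} while (!kthread_should_stop());

	return 0;
}

With that split, the interrupt handler is reduced to little more than
acking the interrupt and calling wake_up_process() on the thread.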
> >-	ring->next_context_status_buffer = write_pointer % GEN8_CSB_ENTRIES;
> >+		spin_unlock(&ring->execlist_lock);
> 
> Would it be worth trying to grab the mutex and unpin the LRCs at
> this point? It would slow down the thread a bit but would get rid of
> the retired work queue. With the vma->iomap thingy it could be quite
> cheap?

Exactly, we want the iomap/vmap caching thingy first :) But the retired
work queue disappears as a fallout of your previous-context idea anyway,
plus the fix to avoid the struct_mutex when freeing requests.

> >-	/* Update the read pointer to the old write pointer. Manual ringbuffer
> >-	 * management ftw </sarcasm> */
> >-	I915_WRITE(RING_CONTEXT_STATUS_PTR(ring),
> >-		   _MASKED_FIELD(GEN8_CSB_READ_PTR_MASK,
> >-				 ring->next_context_status_buffer << 8));
> >+		WARN(submit_contexts > 2, "More than two context complete events?\n");
> >+		ring->next_context_status_buffer = tail % GEN8_CSB_ENTRIES;
> >+		I915_WRITE(RING_CONTEXT_STATUS_PTR(ring),
> >+			   _MASKED_FIELD(GEN8_CSB_PTR_MASK << 8,
> >+					 ring->next_context_status_buffer << 8));
> >+	} while (1);
> > }
> >
> > static int execlists_context_queue(struct drm_i915_gem_request *request)
> >@@ -585,7 +582,7 @@ static int execlists_context_queue(struct drm_i915_gem_request *request)
> >
> > 	i915_gem_request_get(request);
> >
> >-	spin_lock_irq(&engine->execlist_lock);
> >+	spin_lock(&engine->execlist_lock);
> >
> > 	list_for_each_entry(cursor, &engine->execlist_queue, execlist_link)
> > 		if (++num_elements > 2)
> >@@ -611,7 +608,7 @@ static int execlists_context_queue(struct drm_i915_gem_request *request)
> > 	if (num_elements == 0)
> > 		execlists_context_unqueue(engine);
> >
> >-	spin_unlock_irq(&engine->execlist_lock);
> >+	spin_unlock(&engine->execlist_lock);
> 
> Hm, another thing which could be quite cool is if here we could just
> wake the thread and let it submit the request, instead of doing it
> directly from 3rd party context.

Yes, this is something I played around with, and my conclusion was that
the extra overhead from calling ttwu (try_to_wake_up) on the majority
of workloads outweighed the few workloads that benefitted.

> That would make all ELSP accesses serialized for free since they
> would only be happening from a single thread. And potentially could
> reduce the scope of the lock even more.
> 
> But it would mean extra latency when transitioning the idle engine
> to busy. Maybe it wouldn't matter for such workloads.

Yup. I saw greater improvement from reducing the overhead along the
execlists context-switch handling path than from the adding of
requests. There is certainly plenty of scope for improvement though.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre