Re: [PATCH 16/21] drm/i915/gt: Track signaled breadcrumbs outside of the breadcrumb spinlock

Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> · Fri, 31 Jul 2020 17:06:08 +0100





On 31/07/2020 16:21, Chris Wilson wrote:
Quoting Chris Wilson (2020-07-31 16:12:32)
Quoting Tvrtko Ursulin (2020-07-31 16:06:55)

On 30/07/2020 10:37, Chris Wilson wrote:
@@ -191,17 +188,19 @@ static void signal_irq_work(struct irq_work *work)
   {
       struct intel_breadcrumbs *b = container_of(work, typeof(*b), irq_work);
       const ktime_t timestamp = ktime_get();
+     struct llist_node *signal, *sn;
       struct intel_context *ce, *cn;
       struct list_head *pos, *next;
-     LIST_HEAD(signal);
+
+     signal = NULL;
+     if (unlikely(!llist_empty(&b->signaled_requests)))
+             signal = llist_del_all(&b->signaled_requests);
   
       spin_lock(&b->irq_lock);
   
-     if (list_empty(&b->signalers))
+     if (!signal && list_empty(&b->signalers))

The only open from previous round was on this change. If I understood
your previous reply correctly, checking this or not simply controls the
disarm point and is not related to this patch. With the check added now
we would disarm later, because even already signaled requests would keep
it armed. I would prefer this was a separate patch if you could possibly
be convinced.

I considered that since we add to the lockless list and then queue the
irq work, that is a path that is not driven by the interrupt and so
causing an issue with the idea of the interrupt shadow. Having a simple
test I thought was a positive side-effect to filter out the early
irq_work.

How about a compromise and I sell the patch with a comment:
         /*
          * Keep the irq armed until the interrupt after all listeners are gone.
          *
          * Enabling/disabling the interrupt is rather costly, roughly a couple
          * of hundred microseconds. If we are proactive and enable/disable
          * the interrupt around every request that wants a breadcrumb, we
          * quickly drown in the extra orders of magnitude of latency imposed
          * on request submission.
          *
          * So we try to be lazy, and keep the interrupts enabled until no
          * more listeners appear within a breadcrumb interrupt interval (that
          * is until a request completes that no one cares about). The
          * observation is that listeners come in batches, and will often
          * listen to a bunch of requests in succession.
          *
          * We also try to avoid raising too many interrupts, as they may
          * be generated by userspace batches and it is unfortunately rather
          * too easy to drown the CPU under a flood of GPU interrupts. Thus
          * whenever no one appears to be listening, we turn off the interrupts.
          * Fewer interrupts should conserve power -- at the very least, fewer
          * interrupt draw less ire from other users of the system and tools
          * like powertop.
	 */

It really feels like it should be a separate patch.

I am not sure at the moment how exactly the hysteresis this would apply 
would look like. The end point is driven by the requests on the signaled 
list, but that is also driven by the timing of listeners adding 
themselves versus the request completion. For instance maybe if we want 
a hysteresis we won't something more deterministic and explicit. Maybe 
tied directly to the user interrupt following no more listeners. Like 
simply disarm on the second irq work after all b->signalers have gone. I 
just can't picture the state or b->signaled_requests in relation to all 
dynamic actions.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx