Re: [PATCH 14/19] drm/i915: Reconstruct active state on starting busy-stats

On 08/01/2018 14:29, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-01-08 14:18:53)
>> On 02/01/2018 15:12, Chris Wilson wrote:
>>> We have a hole in our busy-stat accounting: if the PMU is enabled
>>> during a long-running batch, it will not start accumulating busy-time
>>> until the next context switch. This then fails tests that sample only
>>> a single batch.

>> Oops, something in the recent perf_pmu rework uncovered this?

> Yeah, the tweaks reduced some tests to submitting a single batch started
> before the counters. Previously all tests started the counters and then
> submitted the workload. I had a test planned, but obviously forgot to
> write it. I couldn't think of a way to guarantee hitting the
> lite-restore case, though.

Wouldn't a spin batch from ctx 1, plus another on ctx 2 after a short
delay, do it? Or at least do it reliably?
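
For what it's worth, a minimal sketch of that idea, assuming the
igt_spin_batch_new()/igt_spin_batch_free() helpers from current i-g-t
and a guessed 1ms delay; whether this reliably hits the lite-restore
path, rather than just occupying both ports, is exactly the part I am
unsure about:

#include <unistd.h>
#include "igt.h"

static void two_ctx_spin(int fd, unsigned int engine)
{
	uint32_t ctx[2] = { gem_context_create(fd), gem_context_create(fd) };
	igt_spin_t *spin[2];

	/* Start an unterminated batch on the first context... */
	spin[0] = igt_spin_batch_new(fd, ctx[0], engine, 0);
	usleep(1000); /* guess: long enough for it to reach the ELSP */

	/* ...then queue a second context behind it, so the hardware is
	 * already busy by the time the counters are enabled. */
	spin[1] = igt_spin_batch_new(fd, ctx[1], engine, 0);

	/* <- enable busy-stats / open the PMU counter here */

	igt_spin_batch_free(fd, spin[1]);
	igt_spin_batch_free(fd, spin[0]);
	gem_context_destroy(fd, ctx[1]);
	gem_context_destroy(fd, ctx[0]);
}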

>>> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
>>> ---
>>>   drivers/gpu/drm/i915/intel_engine_cs.c | 9 ++++++++-
>>>   1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
>>> index ebdcbcbacb3c..c2f01ff96ce1 100644
>>> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
>>> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
>>> @@ -1946,8 +1946,15 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
>>>        spin_lock_irqsave(&engine->stats.lock, flags);
>>>        if (engine->stats.enabled == ~0)
>>>                goto busy;
>>> -     if (engine->stats.enabled++ == 0)
>>> +     if (engine->stats.enabled++ == 0) {
>>>                engine->stats.enabled_at = ktime_get();
>>> +
>>> +             /* XXX submission method oblivious */

>> You mean once busy-stats support for the GuC gets added?

> I was just thinking that we shouldn't be poking into execlists here and
> that we need an agnostic method for execlists to tell busy-stats that
> it is starting active. Or something.

Okay, yes. We would need to define explicitly what the API returns,
perhaps the number of contexts submitted to the hardware. But that
could be trouble for the GuC backend. Will think about it.
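
Something along these lines is roughly what I have in mind; purely a
hypothetical sketch, with stats_ops/active_contexts() invented for
illustration (and keeping the patch's summing semantics, which is the
very thing we are debating below):

struct intel_engine_stats_ops {
	/*
	 * How many contexts has this backend currently submitted to
	 * the hardware? Execlists can derive it from the ELSP ports;
	 * a GuC backend would need its own bookkeeping.
	 */
	unsigned int (*active_contexts)(struct intel_engine_cs *engine);
};

static unsigned int execlists_active_contexts(struct intel_engine_cs *engine)
{
	struct intel_engine_execlists *el = &engine->execlists;

	return port_count(&el->port[0]) + port_count(&el->port[1]);
}

intel_enable_engine_stats() would then seed the counter via something
like engine->stats.ops->active_contexts(engine) instead of reaching
into engine->execlists directly.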

>>> +             engine->stats.active = port_count(&engine->execlists.port[1]);
>>> +             engine->stats.active += port_count(&engine->execlists.port[0]);

>> Hm, wouldn't this have the potential to over-account the active
>> counter, resulting in a permanent 100%? port_count() is in [0, 2],
>> where 2 means re-submission of port 0, while engine->stats.active is
>> in [0, 2], where this 2 is the number of ports. In other words I
>> think the correct method would be:
>>
>> if (port_count(&engine->execlists.port[0]))
>>         engine->stats.active++;
>> if (port_count(&engine->execlists.port[1]))
>>         engine->stats.active++;

> I thought we needed stats.active to reflect the number of pending
> context_out events we will report. So for the lite-restore case, I
> thought 2 was the correct value.

Hm, so am I mistaken that we would still get only one context_out (the
port->count to zero transition) from port0 in the lite-restore case? I
can't spot it in the code at the moment, but who knows.
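
To put numbers on the disagreement (a sketch of the two readings, not a
claim about what the CSB handling actually does): take the lite-restore
state port[0].count == 2, port[1].count == 0.

  Summing port counts (the patch):      stats.active = 2 + 0 = 2
  Counting occupied ports (my version): stats.active = 1

If every count decrement fires intel_engine_context_out(), two events
arrive and the patch's value drains cleanly to zero. If only the
count-to-zero transition fires a single context_out, the summed value
gets stuck at 1 and the engine reads as permanently 100% busy.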

Regards,

Tvrtko