Quoting Tvrtko Ursulin (2018-01-08 14:44:27)
>
> On 08/01/2018 14:29, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-01-08 14:18:53)
> >>
> >> On 02/01/2018 15:12, Chris Wilson wrote:
> >>> We have a hole in our busy-stat accounting if the pmu is enabled
> >>> during a long running batch; the pmu will not start accumulating
> >>> busy-time until the next context switch. This then fails tests that
> >>> are only sampling a single batch.
> >>
> >> Oops, something in the recent perf_pmu rework uncovered this?
> >
> > Yeah, the tweaks reduced some tests to submitting a single batch
> > started before the counters. Previously all tests started the
> > counters and then submitted the workload. I had a test planned, but
> > obviously forgot to write it. I couldn't think of a way to guarantee
> > hitting the lite-restore case though.
>
> Spin batch from ctx 1 and another to ctx 2, after a short delay,
> wouldn't do it? Or at least reliably?

You need:

	spin batch ctx1
	short batch ctx2
	spin batch ctx2

Terminate spin_batch_ctx1 and then you have a very small window where we
get the CS interrupt from ctx1 and we resubmit ctx2 with both short+spin
batches, up until we then get the lite-restore CS interrupt. That's the
window you have to hit, a few microseconds.

> >>> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> >>> Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> >>> ---
> >>>   drivers/gpu/drm/i915/intel_engine_cs.c | 9 ++++++++-
> >>>   1 file changed, 8 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> >>> index ebdcbcbacb3c..c2f01ff96ce1 100644
> >>> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> >>> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> >>> @@ -1946,8 +1946,15 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
> >>>  	spin_lock_irqsave(&engine->stats.lock, flags);
> >>>  	if (engine->stats.enabled == ~0)
> >>>  		goto busy;
> >>> -	if (engine->stats.enabled++ == 0)
> >>> +	if (engine->stats.enabled++ == 0) {
> >>>  		engine->stats.enabled_at = ktime_get();
> >>> +
> >>> +		/* XXX submission method oblivious */
> >>
> >> You mean once busy stats support for the GuC gets added?
> >
> > I was just thinking that we shouldn't be poking into execlists here
> > and that we need an agnostic method for execlists to tell busy_stats
> > that it is starting active. Or something.
>
> Okay, yes. We would need to define explicitly what the API returns,
> perhaps the number of contexts submitted to the hardware. But that
> could be trouble for the GuC backend. Will think about it.
>
> >>> +		engine->stats.active = port_count(&engine->execlists.port[1]);
> >>> +		engine->stats.active += port_count(&engine->execlists.port[0]);
> >>
> >> Hm, wouldn't this have the potential to over-account the active
> >> counter, resulting in a permanent 100%? Port count is [0, 2], where
> >> 2 is re-submission of port 0, while engine->stats.active is [0, 2],
> >> where this 2 is the number of ports. In other words, I think the
> >> correct method would be:
> >>
> >>   if (port_count(0))
> >>   	active++;
> >>   if (port_count(1))
> >>   	active++;
> >
> > I thought we needed stats.active to reflect the number of pending
> > context_out events we will report. So for the lite-restore case, I
> > thought 2 was the correct value.
>
> Hm, so am I mistaken that we would still get only one context_out
> (port->count to zero transition) from port0 in the lite-restore case?
> I can't spot it in the code I am looking at right now, but who knows.
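For illustration, the per-port seeding Tvrtko suggests might look
roughly like this inside intel_enable_engine_stats(); port_count() and
the execlists port array are as in the quoted diff, while the rest is a
sketch rather than the patch as posted:

	/*
	 * Sketch: seed stats.active with at most one tick per
	 * populated port, so it matches the number of count!=0 ->
	 * count==0 transitions that context_out will later report,
	 * rather than summing the raw port counts (which can reach 2
	 * on lite-restore and over-account to a permanent 100%).
	 */
	if (engine->stats.enabled++ == 0) {
		struct intel_engine_execlists *execlists = &engine->execlists;

		engine->stats.enabled_at = ktime_get();

		engine->stats.active = 0;
		if (port_count(&execlists->port[0]))
			engine->stats.active++;
		if (port_count(&execlists->port[1]))
			engine->stats.active++;
	}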
Nope, my mistake, it is just the transitions to/from count==0 that are
relevant. More reason to try and hit the lite-restore window :)
-Chris
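For reference, the submission pattern described above for hitting the
lite-restore window could be sketched with 2018-era igt helpers along
the following lines; igt_spin_batch_new() and friends exist in igt, but
the exact signatures and timing here are assumptions, and
submit_short_batch() is a hypothetical stand-in for an execbuf of a
bare MI_BATCH_BUFFER_END, so treat this as a sketch, not a verified
test:

	uint32_t ctx1 = gem_context_create(fd);
	uint32_t ctx2 = gem_context_create(fd);
	igt_spin_t *spin1, *spin2;

	/* ctx1 spinner occupies port0 */
	spin1 = igt_spin_batch_new(fd, ctx1, I915_EXEC_RENDER, 0);
	/* short ctx2 batch queues behind the ctx1 spinner */
	submit_short_batch(fd, ctx2);
	/* second ctx2 request, to be coalesced with the short batch */
	spin2 = igt_spin_batch_new(fd, ctx2, I915_EXEC_RENDER, 0);

	/*
	 * Completing ctx1 raises its CS interrupt and ctx2 is
	 * resubmitted with short+spin combined; until the lite-restore
	 * CS interrupt lands there is a microseconds-wide window in
	 * which port0 carries count == 2.
	 */
	igt_spin_batch_end(spin1);

	/* enable/sample the engine busy counter here to hit the window */

	igt_spin_batch_end(spin2);
	igt_spin_batch_free(fd, spin1);
	igt_spin_batch_free(fd, spin2);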