Re: [PATCH v3] drm/i915/pmu: Reconstruct active state on starting busy-stats

On 12/01/2018 09:09, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2018-01-12 08:16:19)

On 11/01/2018 07:30, Chris Wilson wrote:
We have a hole in our busy-stat accounting: if the pmu is enabled during
a long running batch, the pmu will not start accumulating busy-time
until the next context switch. This then fails tests that are only
sampling a single batch.

v2: Count each active port just once (context in/out events are only on
the first and last assignment to a port).
v3: Avoid hardcoding knowledge of 2 submission ports

Fixes: 30e17b7847f5 ("drm/i915: Engine busy time tracking")
Testcase: igt/perf_pmu/busy-start
Testcase: igt/perf_pmu/busy-double-start
Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
---
   drivers/gpu/drm/i915/intel_engine_cs.c | 16 +++++++++++++++-
   1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 6bb51a502b8b..d790bdc227ff 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1951,8 +1951,22 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
       spin_lock_irqsave(&engine->stats.lock, flags);
       if (engine->stats.enabled == ~0)
               goto busy;
-     if (engine->stats.enabled++ == 0)
+     if (engine->stats.enabled++ == 0) {
+             struct intel_engine_execlists *execlists = &engine->execlists;
+             const struct execlist_port *port = execlists->port;
+             unsigned int num_ports = execlists_num_ports(execlists);
+
               engine->stats.enabled_at = ktime_get();
+
+             /* XXX submission method oblivious? */
+             while (num_ports-- && port_isset(port)) {
+                     engine->stats.active++;
+                     port++;
+             }

Argh, engine->timeline->lock is required to do this safely. But it needs
to be taken outside the engine->stats.lock. I can't think of any problems
doing it at the moment. What do you think?

This isn't even protected by engine->timeline->lock. Writes to the ports
are serialised by the tasklet alone. Note that this is a pure read, so

Yes, my bad.

it won't explode, just potentially miscalculate active so that the busy
period never ends. However, if you look at the other side, the update of
stats.active by the tasklet is serialised by stats.lock; is that enough
to argue with?

Nice one for early morning.. :)
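
For reference, the in/out side of the accounting is roughly this
(paraphrasing intel_engine_context_in() from intel_ringbuffer.h from
memory, so treat the details as approximate; context_out is the mirror
image, decrementing active and accumulating the busy time once it
reaches zero):

static inline void intel_engine_context_in(struct intel_engine_cs *engine)
{
	unsigned long flags;

	/* Cheap unlocked check; enabling stats is the rare case. */
	if (READ_ONCE(engine->stats.enabled) == 0)
		return;

	spin_lock_irqsave(&engine->stats.lock, flags);
	if (engine->stats.enabled > 0) {
		/* First port to become busy opens the busy period. */
		if (engine->stats.active++ == 0)
			engine->stats.start = ktime_get();
	}
	spin_unlock_irqrestore(&engine->stats.lock, flags);
}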

From the completion side, current ordering:

port0 context complete
context_out - not enabled, no-op
stats enable - port0 busy, active = 1
port0 clear
submit
context_in - active = 2 !!! BAD
port0 set

So a problem, no? Do we need to change the ordering of port updates vs the context_in/out calls? On the completion side, either ordering of context_out vs the port0 clear (a = before, b = after) works out:

port0 context complete
stats enable - port0 busy, active = 1
a) context_out - enabled, active = 0
port0 clear
b) context_out - enabled, active = 0
submit
context_in - active = 1 GOOD
port0 set

From the submit side, current ordering:

port0 submit
context_in - not enabled, no-op
stats enable - port0 not busy, active = 0
port0 set
port0 context complete
context_out - active = -1 !!! BAD

Opposite ordering (context_in after port0 set), with stats enable landing either before (a) or after (b) the port0 set:

port0 submit

a) stats enable - port0 not busy, active = 0
a) port0 set
a) context_in - active = 1 OK

b) port0 set
b) stats enable - port0 busy, active = 1
b) context_in - active = 2 !!! BAD

So the submit side doesn't work in either case, unless I am missing something. We would need the pair of port manipulation and context_in to be atomic.
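
One way to get that, assuming it is acceptable on the enable path, would
be to keep the execlists tasklet from running while we sample the ports,
along these lines (illustrative sketch only; the tasklet field name is
from memory):

	struct intel_engine_execlists * const execlists = &engine->execlists;

	/*
	 * Hypothetical: park the submission tasklet so neither the ports
	 * nor stats.active can change while we enable. Must be done before
	 * taking stats.lock, as tasklet_disable() may spin waiting for a
	 * running tasklet to complete.
	 */
	tasklet_disable(&execlists->irq_tasklet);

	spin_lock_irqsave(&engine->stats.lock, flags);
	/* ... count the busy ports and set enabled_at as in the patch ... */
	spin_unlock_irqrestore(&engine->stats.lock, flags);

	tasklet_enable(&execlists->irq_tasklet);

That would make the port sample and the subsequent context_in/out
updates strictly ordered, at the cost of a tasklet_disable() on what is
anyway a slow path.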

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
