Re: [PATCH i-g-t 2/2] tests/perf_pmu: Exercise busy stats and lite-restore

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 09/01/2018 21:39, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2018-01-09 16:16:21)
From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>

While developing a fix for an accounting hole in busy stats we realized
lite-restore is a potential edge case which would be interesting to check
is properly handled.

It is unfortnately quite timing sensitive to hit lite-restore in the
fashion test needs, so downside of this test is that it sufferes from a
high rate of false negatives.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
---
  tests/perf_pmu.c | 85 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  1 file changed, 85 insertions(+)

diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
index e1f449d48808..7db5059d202e 100644
--- a/tests/perf_pmu.c
+++ b/tests/perf_pmu.c
@@ -192,6 +192,84 @@ busy_start(int gem_fd, const struct intel_execution_engine2 *e)
         gem_quiescent_gpu(gem_fd);
  }
+/*
+ * This test has a potentially low rate of catching the issue it is trying to
+ * catch. Or in other words, quite high rate of false negative successes. We
+ * will depend on the CI systems running it a lot to detect issues.
+ */
+static void
+busy_double_start(int gem_fd, const struct intel_execution_engine2 *e)
+{
+       unsigned long slept;
+       igt_spin_t *spin[2];
+       uint64_t val, val2;
+       bool execlists;
+       uint32_t ctx;
+       int fd;
+
+       ctx = gem_context_create(gem_fd);
+
+       if (gem_has_execlists(gem_fd)) {
+               /*
+                * Defeat the busy stats delayed disable, we need to guarantee
+                * we are the first user.
+                */
+               execlists = true;
+               sleep(2);
+       } else {
+               execlists = false;
+       }
+
+       /*
+        * Submit two contexts, with a pause in between targeting the ELSP
+        * re-submission in execlists mode. Make sure busyness is correctly
+        * reported with the engine busy, and after the engine went idle.
+        */
+       spin[0] = __igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0);
+       usleep(500000);
+       spin[1] = __igt_spin_batch_new(gem_fd, ctx, e2ring(gem_fd, e), 0);

If you want to avoid the second spinner, you can resubmit the first with
a second context.

I know you like that but I prefer the pedestrian way. :)

+       /*
+        * Open PMU as fast as possible after the second spin batch in attempt
+        * to be faster than the driver handling lite-restore.
+        */
+       fd = open_pmu(I915_PMU_ENGINE_BUSY(e->class, e->instance));
+
+       slept = measured_usleep(batch_duration_ns / 1000);
+       val = pmu_read_single(fd);
+
+       igt_spin_batch_end(spin[0]);
+       igt_spin_batch_end(spin[1]);
+
+       gem_sync(gem_fd, spin[0]->handle);
+       gem_sync(gem_fd, spin[1]->handle);
+
+       /* Wait for context complete to avoid accounting residual busyness. */
+       if (execlists)
+               usleep(100000);

Then you want a gem_quiescent_gpu() instead of the gem_sync()? Though
I'm scratching my head here as to what you mean. If there's an issue
here, won't we see it in our other measurements as well?

Primarily because we want to measure idle after busy here and not in other tests, *and*, with execlists time between user interrupt and context save counts as busyness. So without this delay it is possible to observe a few micro-second of engine busyness in the below check.

+       val2 = pmu_read_single(fd);
+       usleep(batch_duration_ns / 1000);
+       val2 = pmu_read_single(fd) - val2;

But the engine is idle at this point? Why sleep for execlists and not
legacy? Isn't the execlists case where we know that just by waiting for
idle (gem_quiescent_gpu() not an arbitrary usleep) then the pmu counter
is idle. It's legacy where we would have to wait for an sample interval
following gpu idle to be sure the pmu is idle.

Hm legacy and sampling.. I think we don't have to wait since there is no context save there is the idleness comes straight with the user interrupt, or no?

I think I half understand what you want to assert (an idle gpu after
being busy, reports idle?) but I'm not entirely sure. :)

Yep!

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux