Quoting Tvrtko Ursulin (2018-01-10 10:37:51)
>
> On 09/01/2018 21:39, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-01-09 16:16:21)
> >> From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> >>
> >> While developing a fix for an accounting hole in busy stats we realized
> >> lite-restore is a potential edge case which would be interesting to check
> >> is properly handled.
> >>
> >> It is unfortunately quite timing sensitive to hit lite-restore in the
> >> fashion the test needs, so the downside of this test is that it suffers
> >> from a high rate of false negatives.
> >>
> >> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> >> ---
> >>  tests/perf_pmu.c | 85 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>  1 file changed, 85 insertions(+)
> >>
> >> diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
> >> index e1f449d48808..7db5059d202e 100644
> >> --- a/tests/perf_pmu.c
> >> +++ b/tests/perf_pmu.c
> >> @@ -192,6 +192,84 @@ busy_start(int gem_fd, const struct intel_execution_engine2 *e)
> >>          gem_quiescent_gpu(gem_fd);
> >>  }
> >>
> >> +/*
> >> + * This test has a potentially low rate of catching the issue it is trying to
> >> + * catch. Or in other words, quite a high rate of false negative successes. We
> >> + * will depend on the CI systems running it a lot to detect issues.
> >> + */
> >> +static void
> >> +busy_double_start(int gem_fd, const struct intel_execution_engine2 *e)
> >> +{
> >> +        unsigned long slept;
> >> +        igt_spin_t *spin[2];
> >> +        uint64_t val, val2;
> >> +        bool execlists;
> >> +        uint32_t ctx;
> >> +        int fd;
> >> +
> >> +        ctx = gem_context_create(gem_fd);
> >> +
> >> +        if (gem_has_execlists(gem_fd)) {
> >> +                /*
> >> +                 * Defeat the busy stats delayed disable, we need to guarantee
> >> +                 * we are the first user.
> >> +                 */
> >> +                execlists = true;
> >> +                sleep(2);
> >> +        } else {
> >> +                execlists = false;
> >> +        }
> >> +
> >> +        /*
> >> +         * Submit two contexts, with a pause in between targeting the ELSP
> >> +         * re-submission in execlists mode. Make sure busyness is correctly
> >> +         * reported with the engine busy, and after the engine went idle.
> >> +         */
> >> +        spin[0] = __igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0);
> >> +        usleep(500000);
> >> +        spin[1] = __igt_spin_batch_new(gem_fd, ctx, e2ring(gem_fd, e), 0);
> >
> > If you want to avoid the second spinner, you can resubmit the first with
> > a second context.
>
> I know you like that but I prefer the pedestrian way. :)
>
> >> +        /*
> >> +         * Open PMU as fast as possible after the second spin batch in an
> >> +         * attempt to be faster than the driver handling lite-restore.
> >> +         */
> >> +        fd = open_pmu(I915_PMU_ENGINE_BUSY(e->class, e->instance));
> >> +
> >> +        slept = measured_usleep(batch_duration_ns / 1000);
> >> +        val = pmu_read_single(fd);
> >> +
> >> +        igt_spin_batch_end(spin[0]);
> >> +        igt_spin_batch_end(spin[1]);
> >> +
> >> +        gem_sync(gem_fd, spin[0]->handle);
> >> +        gem_sync(gem_fd, spin[1]->handle);
> >> +
> >> +        /* Wait for context complete to avoid accounting residual busyness. */
> >> +        if (execlists)
> >> +                usleep(100000);
> >
> > Then you want a gem_quiescent_gpu() instead of the gem_sync()? Though
> > I'm scratching my head here as to what you mean. If there's an issue
> > here, won't we see it in our other measurements as well?
>
> Primarily because we want to measure idle after busy here and not in
> other tests, *and*, with execlists the time between the user interrupt
> and the context save counts as busyness. So without this delay it is
> possible to observe a few microseconds of engine busyness in the below
> check.

Ok, so with a gem_quiescent_gpu() we shouldn't need an arbitrary sleep,
as we are stating the gpu is then idle (ELSP drained etc).

> >> +        val2 = pmu_read_single(fd);
> >> +        usleep(batch_duration_ns / 1000);
> >> +        val2 = pmu_read_single(fd) - val2;
> >
> > But the engine is idle at this point? Why sleep for execlists and not
> > legacy? Isn't the execlists case the one where we know that just by
> > waiting for idle (gem_quiescent_gpu(), not an arbitrary usleep) the pmu
> > counter is idle. It's legacy where we would have to wait for a sample
> > interval following gpu idle to be sure the pmu is idle.
>
> Hm, legacy and sampling... I think we don't have to wait, since there is
> no context save there the idleness comes straight with the user
> interrupt, or no?

Hmm. Right. Since the sample after completion doesn't add any time to
the counter, you only need to ensure that pmu_read_single() is after the
seqno completion (i.e. gem_sync) to assert it shouldn't increment
anymore.

Ok. I think you can just replace the

> >> +        gem_sync(gem_fd, spin[0]->handle);
> >> +        gem_sync(gem_fd, spin[1]->handle);
> >> +
> >> +        /* Wait for context complete to avoid accounting residual busyness. */
> >> +        if (execlists)
> >> +                usleep(100000);

with a gem_quiescent_gpu() and everybody is happy. With that tweak,

Reviewed-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>

though I'd be happy if the initial sleep(2) were unconditional (pending
suitable information exported from the kernel).
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
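[Editorial sketch: for reference, this is roughly how the tail of
busy_double_start() could look with the suggested tweak applied. It is an
illustration only, not the posted patch: pmu_read_single(),
measured_usleep(), assert_within_epsilon(), batch_duration_ns and
tolerance are helpers/constants local to tests/perf_pmu.c, and the
closing assertions are an assumption about the unquoted remainder of the
test.]

        /* Engine known busy under both spinners at this point. */
        slept = measured_usleep(batch_duration_ns / 1000);
        val = pmu_read_single(fd);

        igt_spin_batch_end(spin[0]);
        igt_spin_batch_end(spin[1]);

        /*
         * gem_quiescent_gpu() waits for the GPU to go fully idle (ELSP
         * drained, contexts saved), standing in for the gem_sync() pair
         * and the arbitrary usleep(100000) on execlists.
         */
        gem_quiescent_gpu(gem_fd);

        /* With the engine idle the busy counter must not advance. */
        val2 = pmu_read_single(fd);
        usleep(batch_duration_ns / 1000);
        val2 = pmu_read_single(fd) - val2;

        assert_within_epsilon(val, slept, tolerance);
        igt_assert_eq(val2, 0);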