On 11/11/2019 11:40, Chris Wilson wrote:
We cannot control how long RCU takes to find a quiescent point as that
depends upon the background load and so may take an arbitrary time.
Instead, let's try to avoid that impacting our measurements by inserting
an rcu_barrier() before our critical timing sections and hope that hides
the issue, letting us always perform a fast reset. FWIW, we already do an
expedited RCU synchronize, but that is not always enough.
Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
---
tests/i915/gem_eio.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/tests/i915/gem_eio.c b/tests/i915/gem_eio.c
index 8d6cb9760..49d2a99e9 100644
--- a/tests/i915/gem_eio.c
+++ b/tests/i915/gem_eio.c
@@ -71,6 +71,7 @@ static void trigger_reset(int fd)
{
struct timespec ts = { };
+ rcu_barrier(fd); /* flush any excess work before we start timing */
igt_nsec_elapsed(&ts);
igt_kmsg(KMSG_DEBUG "Forcing GPU reset\n");
@@ -227,6 +228,10 @@ static void hang_handler(union sigval arg)
igt_debug("hang delay = %.2fus\n",
igt_nsec_elapsed(&ctx->delay) / 1000.0);
+ /* flush any excess work before we start timing our reset */
+ igt_assert(igt_sysfs_printf(ctx->debugfs, "i915_drop_caches",
+ "%d", DROP_RCU));
+
igt_nsec_elapsed(ctx->ts);
igt_assert(igt_sysfs_set(ctx->debugfs, "i915_wedged", "-1"));
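
Both hunks follow the same pattern: drain pending background work first, and only then start the clock, so the flush cost is not charged to the operation being timed. Below is a minimal standalone sketch of that ordering. It is not IGT code: time_after_flush(), flush_fn, demo_flush() and demo_op() are hypothetical names used only for illustration, and the sleeping flush stands in for rcu_barrier(fd) or the DROP_RCU write to i915_drop_caches.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical hook: returns once previously queued background work is done. */
typedef void (*flush_fn)(void *data);

static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Run op() and return its duration in nanoseconds. The optional flush runs
 * before the clock is started, so its cost is excluded from the result.
 */
static uint64_t time_after_flush(flush_fn flush, void *flush_data,
				 void (*op)(void *), void *op_data)
{
	uint64_t start;

	assert(op != NULL);

	if (flush)
		flush(flush_data);

	start = now_ns();
	op(op_data);
	return now_ns() - start;
}

/* Demo helpers (assumptions for illustration only). */
struct trace {
	int flushed;           /* set once the flush has completed */
	int flushed_before_op; /* what op() observed when it ran */
};

/* Stand-in for a slow flush: sleeps 50ms, then marks completion. */
static void demo_flush(void *data)
{
	struct trace *t = data;
	struct timespec delay = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000 };

	nanosleep(&delay, NULL);
	t->flushed = 1;
}

/* Stand-in for the timed operation: records whether the flush came first. */
static void demo_op(void *data)
{
	struct trace *t = data;

	t->flushed_before_op = t->flushed;
}
```

Calling time_after_flush(demo_flush, &t, demo_op, &t) on a zeroed struct trace should report a duration well below the 50ms spent in the flush, since that delay is paid before the timer starts.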
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
You can avoid scoring demerit points by adding a reference to the Bugzilla
entry, presumably one linking to CI results, showing this was known to be flaky. :)
Regards,
Tvrtko