Re: [PATCH i-g-t] i915/gem_eio: Flush RCU before timing our own critical sections

Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> · Mon, 11 Nov 2019 15:49:32 +0000

On 11/11/2019 11:40, Chris Wilson wrote:
We cannot control how long RCU takes to find a quiescent point as that
depends upon the background load and so may take an arbitrary time.
Instead, let's try to avoid that impacting our measurements by inserting
an rcu_barrier() before our critical timing sections and hope that hides
the issue, letting us always perform a fast reset. Fwiw, we do the
expedited RCU synchronize, but that is not always enough.

Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
---
  tests/i915/gem_eio.c | 5 +++++
  1 file changed, 5 insertions(+)

diff --git a/tests/i915/gem_eio.c b/tests/i915/gem_eio.c
index 8d6cb9760..49d2a99e9 100644
--- a/tests/i915/gem_eio.c
+++ b/tests/i915/gem_eio.c
@@ -71,6 +71,7 @@ static void trigger_reset(int fd)
  {
  	struct timespec ts = { };
  
+	rcu_barrier(fd); /* flush any excess work before we start timing */
  	igt_nsec_elapsed(&ts);
  
  	igt_kmsg(KMSG_DEBUG "Forcing GPU reset\n");
@@ -227,6 +228,10 @@ static void hang_handler(union sigval arg)
  	igt_debug("hang delay = %.2fus\n",
  		  igt_nsec_elapsed(&ctx->delay) / 1000.0);
  
+	/* flush any excess work before we start timing our reset */
+	igt_assert(igt_sysfs_printf(ctx->debugfs, "i915_drop_caches",
+				    "%d", DROP_RCU));
+
  	igt_nsec_elapsed(ctx->ts);
  	igt_assert(igt_sysfs_set(ctx->debugfs, "i915_wedged", "-1"));
  


Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>

You can avoid scoring demerit points by adding a reference to the
bugzilla entry, presumably linking to CI results, that shows this test
was known to be flaky. :)
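
For anyone reading along, the pattern the patch applies (flush deferred work first, then start the clock) can be sketched outside of IGT roughly as below. This is only an illustration under assumptions: flush_pending_work(), timed_section(), now_ns() and fake_reset() are hypothetical names, and the flush is a stub standing in for the DROP_RCU write to i915_drop_caches shown in the diff.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the RCU flush; counts calls so it can be checked. */
static int flush_calls;

static void flush_pending_work(void)
{
	/* In the patch this is the DROP_RCU write to i915_drop_caches. */
	flush_calls++;
}

/* Monotonic clock in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;
	int ret = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(ret == 0);
	(void)ret; /* keep NDEBUG builds free of unused-variable warnings */

	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical placeholder for the operation being timed. */
static void fake_reset(void)
{
}

/*
 * Time fn(), flushing deferred work before the clock starts so that the
 * flush itself is not charged to the measured section.
 */
static uint64_t timed_section(void (*fn)(void))
{
	uint64_t start;

	flush_pending_work(); /* outside the timed window */
	start = now_ns();
	fn();

	return now_ns() - start;
}
```

A caller would use timed_section(fake_reset) and compare the result against its timeout. The only point of the sketch is the ordering: the flush happens before the first timestamp is taken, so its cost is not charged to the measured section.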

Regards,

Tvrtko