Quoting Antonio Argenziano (2017-10-13 21:49:29)
> A client that submits 'bad' contexts will be banned eventually while
> other clients are not affected. Add a test for this.
>
> v2
>  - Do not use a fixed number of contexts to ban client (Chris)
>
> Signed-off-by: Antonio Argenziano <antonio.argenziano@xxxxxxxxx>
>
> Cc: Michel Thierry <michel.thierry@xxxxxxxxx>
> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> ---
>  tests/gem_reset_stats.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 58 insertions(+)
>
> diff --git a/tests/gem_reset_stats.c b/tests/gem_reset_stats.c
> index edc40767..da309237 100644
> --- a/tests/gem_reset_stats.c
> +++ b/tests/gem_reset_stats.c
> @@ -438,6 +438,61 @@ static void test_ban_ctx(const struct intel_execution_engine *e)
>  	close(fd);
>  }
>
> +static void test_client_ban(const struct intel_execution_engine *e)
> +{
> +	int fd_bad,fd_good;
> +	struct local_drm_i915_reset_stats rs_bad, rs_good;
> +	int ban, ctx_bans_retry = 10;
> +	int client_ban, client_bans_retry = 10;
> +	uint32_t ctx_bad;
> +	uint32_t test_ctx;
> +	int active_count = 0;

/*
 * We try to prevent DoS GPU hang attacks from one client affecting the
 * service of other clients by banning the malicious client.
 *
 * To simulate this, we have one client submit a sequence of GPU hangs
 * that we expect to be banned (the client will no longer be allowed
 * to submit new work). Meanwhile a second client will not be affected
 * and allowed to continue.
 *
 * Note this is an extremely simplistic mechanism. For example, each
 * context inside an fd may represent a separate client (e.g. WebGL),
 * in which case the per-fd ban allows one client to DoS a second.
 * On the other hand, a malicious client may be creating a new context
 * for each hang to circumvent per-context restrictions. It may even be
 * creating a new fd each time to circumvent the per-fd restrictions.
 * Finally, the most malicious of all clients may succeed in wedging
 * the driver, bringing everyone to a standstill. In short, this DoS
 * prevention is itself suspect.
 *
 * So given the above, what exactly do we want to define as the
 * expected behaviour? The answer is none of the above; use a
 * timeslicing scheduler and no banning. Any context is then allowed to
 * spend as much of its timeslice spinning as it wants; it won't be
 * allowed to steal GPU time from other processes. DoS averted.
 */

I keep coming to the conclusion that banning is a temporary hack, and not
mandatory uABI, like hangcheck. The usual answer is that we then describe
the implementation's behaviour through, say, a CONTEXT_GETPARAM and let
userspace factor that into account. But again, I have to ask: what
requirements does userspace have?

https://www.khronos.org/registry/OpenGL/extensions/ARB/ARB_robustness.txt

(And how are those requirements going to change?) The implication from the
extension is that we could ban a context after the first hang, but mesa
isn't ready for that!

Maybe you should try writing an explanation of exactly what behaviour you
are trying to elicit here and why? :) One thing that seems nice on the
surface for execlists is reprioritising after a hang, to allow all the
innocent parties to make progress at the expense of the guilty. You will
probably think of a few other approaches that make sense as you think
about what userspace expects.
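[Editor's note: for reference, a rough sketch of what such a GETPARAM query
could look like. This is an illustration, not part of the patch; it assumes
the existing I915_CONTEXT_PARAM_BANNABLE flag is the value userspace would
read, whereas which param (if any) should describe the full ban policy is
exactly the open question above.]

#include <stdbool.h>
#include <stdint.h>
#include <xf86drm.h>
#include <i915_drm.h>

static bool context_is_bannable(int fd, uint32_t ctx_id)
{
	struct drm_i915_gem_context_param p = {
		.ctx_id = ctx_id,
		.param = I915_CONTEXT_PARAM_BANNABLE,
	};

	/* Older kernels do not know this param; assume the default policy. */
	if (drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_GETPARAM, &p))
		return true;

	return p.value != 0;
}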
> +	fd_bad = drm_open_driver(DRIVER_INTEL);
> +	fd_good = drm_open_driver(DRIVER_INTEL);
> +
> +	assert_reset_status(fd_bad, fd_bad, 0, RS_NO_ERROR);
> +	assert_reset_status(fd_good, fd_good, 0, RS_NO_ERROR);
> +
> +	while (client_bans_retry--) {
> +		client_ban = __gem_context_create(fd_bad, &ctx_bad);
> +		if (client_ban == -EIO)
> +			break;
> +
> +		noop(fd_bad, ctx_bad, e);
> +		assert_reset_status(fd_bad, fd_bad, ctx_bad, RS_NO_ERROR);
> +
> +		ctx_bans_retry = 10;
> +		active_count = 0;
> +		while (ctx_bans_retry--) {
> +			inject_hang(fd_bad, ctx_bad, e, BAN);
> +			active_count++;
> +
> +			ban = noop(fd_bad, ctx_bad, e);
> +			if (ban == -EIO)
> +				break;

There's a major variation in submitting a queue of hangs that also needs
to be taken into account (it exercises different code paths in the
kernel).
-Chris
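[Editor's note: a minimal sketch of that queued-hangs variation, reusing
the inject_hang()/noop() helpers from the hunk above. The hypothetical
inject_hang_queue() helper, and the assumption that inject_hang() merely
submits the hanging batch without waiting for the reset, are not from the
patch; if the helper synchronises internally, an asynchronous submit would
be needed instead.]

static int inject_hang_queue(int fd, uint32_t ctx,
			     const struct intel_execution_engine *e,
			     int depth)
{
	/*
	 * Queue several hanging batches back-to-back so the kernel has to
	 * deal with requests already queued behind the guilty one, rather
	 * than a single hang at a time.
	 */
	while (depth--)
		inject_hang(fd, ctx, e, BAN);

	/*
	 * Only now probe whether the context/client has been banned;
	 * noop() returns -EIO once further submissions are rejected.
	 */
	return noop(fd, ctx, e);
}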