Quoting Antonio Argenziano (2017-08-15 22:44:21)
> Sending as RFC to get feedback on what would be the correct behaviour of
> the driver and, therefore, if the test is valid.

It's not a preemption-specific bug. It is fair to say that any client
blocking any other client over a non-contended resource is an issue.
Skip to the end for a really easy way to demonstrate this.

> We do a wait while holding the mutex if we are adding a request and figure
> out that there is no more space in the ring buffer.
> Is that considered a bug?

Yes, but it is one of many priority-inversion problems we have because
we have a BKL. The timeframe for fixing it is a few more years.

> +static void wait_batch(int fd, uint32_t handle)
> +{
> +	int64_t timeout = 1ull * NSEC_PER_SEC; //1 sec
> +
> +	if (gem_wait(fd, handle, &timeout) != 0) {
> +		//Force reset and fail the test
> +		igt_force_gpu_reset(fd);

Just terminate the spin batches.

> +		igt_assert_f(0, "[0x%x] batch did not complete!", handle);
> +	}
> +}
> +
> +/*
> + * This test checks that it is possible for a high priority request to
> + * trigger a preemption if another context has filled its ringbuffer.
> + * The aim of the test is to make sure that high priority requests cannot be
> + * stalled by low priority tasks.
> + */
> +static void preempt_while_ringbuffer_full(int fd, uint32_t engine)
> +{
> +	uint32_t hp_ctx, lp_ctx;
> +	uint32_t hp_batch;
> +	igt_spin_t *lp_batch;
> +
> +	struct drm_i915_gem_exec_object2 obj[2];
> +	struct drm_i915_gem_relocation_entry reloc[1024];

That's a bit excessive for this pi test, no?

> +	struct drm_i915_gem_execbuffer2 execbuf;
> +	const unsigned timeout = 10;
> +
> +	lp_ctx = gem_context_create(fd);
> +	ctx_set_priority(fd, lp_ctx, -MAX_PRIO);
> +
> +	hp_ctx = gem_context_create(fd);
> +	ctx_set_priority(fd, hp_ctx, MAX_PRIO);
> +
> +	igt_require(setup_execbuf(fd, &execbuf, obj, reloc, engine) == 0);
> +	execbuf.rsvd1 = lp_ctx;
> +
> +	/* Stall execution and fill ring */
> +	lp_batch = igt_spin_batch_new(fd, lp_ctx, engine, 0);
> +	igt_fork(child_no, 1) {
> +		fill_ring(fd, &execbuf, 0, timeout);
> +	}
> +
> +	/* We don't know when the ring will be full so keep sending in a loop */

Yes we do. See measure_ring_size (rough sketch at the end of this mail).
Something along these lines:

static void bind_to_cpu(int cpu)
{
	const int ncpus = sysconf(_SC_NPROCESSORS_ONLN);
	struct sched_param rt = { .sched_priority = 99 };
	cpu_set_t allowed;

	igt_assert(sched_setscheduler(getpid(),
				      SCHED_RR | SCHED_RESET_ON_FORK,
				      &rt) == 0);

	CPU_ZERO(&allowed);
	CPU_SET(cpu % ncpus, &allowed);
	igt_assert(sched_setaffinity(getpid(),
				     sizeof(cpu_set_t), &allowed) == 0);
}

	setup_timer(10); /* see the sketch at the end of this mail */
	execbuf.rsvd1 = ctx_lo;
	while (__raw_gem_execbuf(fd, &execbuf) == 0)
		; /* fill the low-priority ring until execbuf blocks */

	/* steal cpu */
	bind_to_cpu(0);

	/* 'success' must live in shared memory (e.g. a MAP_SHARED mmap) */
	igt_fork(child, 1) { /* this child is forced to wait for parent to sleep */
		execbuf.rsvd1 = ctx_hi;
		setup_timer(10);
		*success = __raw_gem_execbuf(fd, &execbuf) == 0;
	}

	setup_timer(20);
	__raw_gem_execbuf(fd, &execbuf); /* sleeps under mutex, releasing child */
	igt_terminate_spin_batches();
	igt_waitchildren();

	igt_assert(*success);

The timer can safely be 10ms.

Similarly: race set-domain (pretty much any GEM ioctl ends up in
set-domain) against a spin-batch; when you hit the race, try any
set-domain ioctl from a second client and observe that it too is
blocked on the first client hogging struct_mutex.

end: For the purpose of testing, just create a debugfs file that
acquires struct_mutex on opening. Then test every ioctl and trap from
a second client.
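For reference, one way to implement the setup_timer() used above -- a
minimal, untested sketch; alarm_handler and setup_timer are illustrative
names I'm making up here, only __raw_gem_execbuf appears in the
pseudocode:

#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/time.h>

#include "i915_drm.h"

static void alarm_handler(int sig)
{
	/* empty: we only need the blocking execbuf ioctl to return EINTR */
}

static void setup_timer(int timeout_ms)
{
	struct sigaction sa = { .sa_handler = alarm_handler };
	struct itimerval itv;

	/* no SA_RESTART, so the sleeping ioctl is not transparently restarted */
	sigaction(SIGALRM, &sa, NULL);

	memset(&itv, 0, sizeof(itv));
	itv.it_value.tv_sec = timeout_ms / 1000;
	itv.it_value.tv_usec = (timeout_ms % 1000) * 1000;
	setitimer(ITIMER_REAL, &itv, NULL);
}

static int __raw_gem_execbuf(int fd, struct drm_i915_gem_execbuffer2 *execbuf)
{
	/* raw ioctl, not drmIoctl(): we must see the EINTR, not retry on it */
	if (ioctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, execbuf))
		return -errno;
	return 0;
}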
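And the rough shape of measure_ring_size, from memory rather than copied
out of IGT (assumes the usual helpers from igt.h plus setup_timer() and
__raw_gem_execbuf() above; the real helper plugs the ring slightly
differently):

static unsigned int measure_ring_size(int fd, unsigned int engine)
{
	struct drm_i915_gem_exec_object2 obj;
	struct drm_i915_gem_execbuffer2 execbuf;
	const uint32_t bbe = MI_BATCH_BUFFER_END;
	unsigned int count;
	igt_spin_t *spin;

	/* block the engine so that every nop below stays in the ring */
	spin = igt_spin_batch_new(fd, 0, engine, 0);

	memset(&obj, 0, sizeof(obj));
	obj.handle = gem_create(fd, 4096);
	gem_write(fd, obj.handle, 0, &bbe, sizeof(bbe));

	memset(&execbuf, 0, sizeof(execbuf));
	execbuf.buffers_ptr = to_user_pointer(&obj);
	execbuf.buffer_count = 1;
	execbuf.flags = engine;

	/* queue nops until execbuf sleeps waiting for ring space */
	setup_timer(10);
	count = 0;
	while (__raw_gem_execbuf(fd, &execbuf) == 0)
		count++;

	igt_spin_batch_free(fd, spin);
	gem_close(fd, obj.handle);

	return count;
}

That count tells you how many requests fit in the ring behind the
spinner, i.e. exactly when the next execbuf would go to sleep under the
mutex, so there is no need to keep submitting blindly in a loop.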
-Chris