Quoting Tvrtko Ursulin (2019-09-23 15:29:11)
> 
> On 20/09/2019 23:26, Chris Wilson wrote:
> > The expectation for bonded submission is that they are run concurrently,
> > in parallel on multiple engines. However, given a lack of constraints in
> > the scheduler's selection combined with timeslicing could mean that the
> > bonded requests could be run in opposite order on the same engine. With
> > just the right pair of requests, this can cause a GPU hang (or at least
> > trigger hangchecker), best (worst) case would be execution running
> > several times slower than ideal.
> 
> I don't see any bonding being setup?
> 
> [comes back later]
> 
> Oh you used only the submit fence and not actually bonds. But you also
> don't use the virtual engine at all?

A is using either of the 2 real engines, B is using the virtual engine
to select the other available engine. Bonding in this case is just that
the requests are bonded together to run in parallel.
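
To spell the pairing out, a rough and untested sketch (not lifted from
the patch): assume i915 is the device fd, ctx already carries the
load-balanced engine map from set_load_balancer() (virtual engine at
index 0, physical siblings at 1..count), and obj_a/obj_b are exec
objects populated elsewhere.

	struct drm_i915_gem_execbuffer2 eb_a = {
		.buffers_ptr = to_user_pointer(&obj_a),
		.buffer_count = 1,
		.rsvd1 = ctx,
		/* A: pick one physical engine explicitly, ask for an out fence */
		.flags = 1 | I915_EXEC_FENCE_OUT,
	};
	struct drm_i915_gem_execbuffer2 eb_b = {
		.buffers_ptr = to_user_pointer(&obj_b),
		.buffer_count = 1,
		.rsvd1 = ctx,
		/* B: the virtual engine, gated on A's submit fence */
		.flags = 0 | I915_EXEC_FENCE_SUBMIT,
	};

	gem_execbuf_wr(i915, &eb_a);	/* out fence returned in rsvd2 >> 32 */
	eb_b.rsvd2 = eb_a.rsvd2 >> 32;	/* B is submitted when A is submitted */
	gem_execbuf(i915, &eb_b);

	close(eb_a.rsvd2 >> 32);

Nothing other than the submit fence ties the two together, so the
balancer is free to put B onto the very engine A is already queued on,
which is the scenario the test is trying to provoke.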
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> > ---
> >  tests/i915/gem_exec_balancer.c | 151 +++++++++++++++++++++++++++++++++
> >  1 file changed, 151 insertions(+)
> > 
> > diff --git a/tests/i915/gem_exec_balancer.c b/tests/i915/gem_exec_balancer.c
> > index 407dc0eca..e4fe75747 100644
> > --- a/tests/i915/gem_exec_balancer.c
> > +++ b/tests/i915/gem_exec_balancer.c
> > @@ -30,6 +30,15 @@
> > 
> >  IGT_TEST_DESCRIPTION("Exercise in-kernel load-balancing");
> > 
> > +#define MI_SEMAPHORE_WAIT		(0x1c << 23)
> > +#define   MI_SEMAPHORE_POLL		(1 << 15)
> > +#define   MI_SEMAPHORE_SAD_GT_SDD	(0 << 12)
> > +#define   MI_SEMAPHORE_SAD_GTE_SDD	(1 << 12)
> > +#define   MI_SEMAPHORE_SAD_LT_SDD	(2 << 12)
> > +#define   MI_SEMAPHORE_SAD_LTE_SDD	(3 << 12)
> > +#define   MI_SEMAPHORE_SAD_EQ_SDD	(4 << 12)
> > +#define   MI_SEMAPHORE_SAD_NEQ_SDD	(5 << 12)
> > +
> >  #define INSTANCE_COUNT (1 << I915_PMU_SAMPLE_INSTANCE_BITS)
> > 
> >  static size_t sizeof_load_balance(int count)
> > @@ -694,6 +703,145 @@ static void bonded(int i915, unsigned int flags)
> >  	gem_context_destroy(i915, master);
> >  }
> > 
> > +static unsigned int offset_in_page(void *addr)
> > +{
> > +	return (uintptr_t)addr & 4095;
> > +}
> > +
> > +static uint32_t create_semaphore_to_spinner(int i915, igt_spin_t *spin)
> > +{
> > +	uint32_t *cs, *map;
> > +	uint32_t handle;
> > +
> > +	handle = gem_create(i915, 4096);
> > +	cs = map = gem_mmap__cpu(i915, handle, 0, 4096, PROT_WRITE);
> > +
> > +	/* Wait until the spinner is running */
> > +	*cs++ = MI_SEMAPHORE_WAIT |
> > +		MI_SEMAPHORE_POLL |
> > +		MI_SEMAPHORE_SAD_NEQ_SDD |
> > +		(4 - 2);
> > +	*cs++ = 0;
> > +	*cs++ = spin->obj[0].offset + 4 * SPIN_POLL_START_IDX;
> > +	*cs++ = 0;
> > +
> > +	/* Then cancel the spinner */
> > +	*cs++ = MI_STORE_DWORD_IMM;
> > +	*cs++ = spin->obj[IGT_SPIN_BATCH].offset +
> > +		offset_in_page(spin->condition);
> > +	*cs++ = 0;
> > +	*cs++ = MI_BATCH_BUFFER_END;
> > +
> > +	*cs++ = MI_BATCH_BUFFER_END;
> > +	munmap(map, 4096);
> > +
> > +	return handle;
> > +}
> > +
> > +static void bonded_slice(int i915)
> > +{
> > +	uint32_t ctx;
> > +	int *stop;
> > +
> > +	igt_require(gem_scheduler_has_semaphores(i915));
> > +
> > +	stop = mmap(0, 4096, PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
> > +	igt_assert(stop != MAP_FAILED);
> > +
> > +	ctx = gem_context_create(i915);
> > +
> > +	for (int class = 0; class < 32; class++) {
> > +		struct i915_engine_class_instance *siblings;
> > +		struct drm_i915_gem_exec_object2 obj[3] = {};
> > +		struct drm_i915_gem_execbuffer2 eb = {};
> > +		unsigned int count;
> > +		igt_spin_t *spin;
> > +
> > +		siblings = list_engines(i915, 1u << class, &count);
> > +		if (!siblings)
> > +			continue;
> > +
> > +		if (count < 2) {
> > +			free(siblings);
> > +			continue;
> > +		}
> > +
> > +		/*
> > +		 * A: semaphore wait on spinner; cancel spinner
> > +		 * B: unpreemptable spinner
> > +		 *
> > +		 * A waits for running ack from B, if scheduled on the same
> > +		 * engine -> hang.
> > +		 *
> > +		 * C+: background load across engines
> > +		 */
> > +
> > +		set_load_balancer(i915, ctx, siblings, count, NULL);
> > +
> > +		spin = __igt_spin_new(i915,
> > +				      .ctx = ctx,
> > +				      .flags = (IGT_SPIN_NO_PREEMPTION |
> > +						IGT_SPIN_POLL_RUN));
> > +		igt_spin_end(spin); /* we just want its address for later */
> > +		gem_sync(i915, spin->handle);
> > +		igt_spin_reset(spin);
> > +
> > +		obj[0] = spin->obj[0];
> > +		obj[1] = spin->obj[1];
> > +		obj[2].handle = create_semaphore_to_spinner(i915, spin);
> > +
> > +		eb.buffers_ptr = to_user_pointer(obj);
> > +		eb.rsvd1 = ctx;
> > +
> > +		*stop = 0;
> > +		igt_fork(child, count + 1) {
> > +			igt_list_del(&spin->link);
> > +
> > +			ctx = gem_context_clone(i915, ctx,
> > +						I915_CONTEXT_CLONE_ENGINES, 0);
> > +
> > +			while (!READ_ONCE(*stop)) {
> > +				spin = igt_spin_new(i915,
> > +						    .ctx = ctx,
> > +						    .engine = (1 + rand() % count),
> 
> With "count + 1" children and rand load may end up uneven across engines
> - are you happy with that?

It's using rand, it's going to be uneven. count + 1 isn't significant in
any way. I stopped as soon as I had the test reliably hitting the issue.

> > +						    .flags = IGT_SPIN_POLL_RUN);
> > +				igt_spin_busywait_until_started(spin);
> > +				usleep(50000);
> 
> 50ms, hm, ideally there should be a pipe signal before parent starts the
> test to know children have started. Otherwise parent can finish before
> they even start, no?

The children are just to provide noise. The requirement is that we have
enough load across the system to cause timeslicing to kick in.

> > +				igt_spin_free(i915, spin);
> > +			}
> > +
> > +			gem_context_destroy(i915, ctx);
> > +		}
> > +
> > +		igt_until_timeout(5) {
> > +			igt_spin_reset(spin);
> 
> What is the reset for?

We are reusing the spinner inside the loop.

> > +
> > +			/* A: Submit the semaphore wait */
> > +			eb.buffer_count = 3;
> > +			eb.flags = (1 + rand() % count) | I915_EXEC_FENCE_OUT;
> > +			gem_execbuf_wr(i915, &eb);
> > +
> > +			/* B: Submit the spinner (in parallel) */
> 
> How in parallel when it is the same context so they are implicitly in order?

Different engines with different timelines, using the submit to request
parallel execution.
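
To illustrate the timelines point, another untested sketch (same
assumptions as above: ctx holds the load-balanced map with at least two
physical siblings at indices 1 and 2). Two requests on different engines
of the same context are not ordered against each other, so both spinners
below are observed running at the same time; it is the submit fence, not
the context, that ties B's start to A's.

	igt_spin_t *a, *b;

	/* Same context, two different physical engines from the map */
	a = igt_spin_new(i915, .ctx = ctx, .engine = 1,
			 .flags = IGT_SPIN_POLL_RUN);
	b = igt_spin_new(i915, .ctx = ctx, .engine = 2,
			 .flags = IGT_SPIN_POLL_RUN);

	/* b starts without waiting for a to finish: separate timelines */
	igt_spin_busywait_until_started(a);
	igt_spin_busywait_until_started(b);

	igt_spin_free(i915, b);
	igt_spin_free(i915, a);

-Chris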