Quoting Mika Kuoppala (2019-10-24 08:21:14)
> Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:
>
> > When setting up the system to perform the atomic reset, we need to
> > serialise with any ongoing interrupt tasklet or else:
> >
> > <0> [472.951428] i915_sel-4442 0d..1 466527056us : __i915_request_submit: rcs0 fence 11659:2, current 0
> > <0> [472.951554] i915_sel-4442 0d..1 466527059us : __execlists_submission_tasklet: rcs0: queue_priority_hint:-2147483648, submit:yes
> > <0> [472.951681] i915_sel-4442 0d..1 466527061us : trace_ports: rcs0: submit { 11659:2, 0:0 }
> > <0> [472.951805] i915_sel-4442 0.... 466527114us : __igt_atomic_reset_engine: i915_reset_engine(rcs0:active) under hardirq
> > <0> [472.951932] i915_sel-4442 0d... 466527115us : intel_engine_reset: rcs0 flags=11d
> > <0> [472.952056] i915_sel-4442 0d... 466527117us : execlists_reset_prepare: rcs0: depth<-1
> > <0> [472.952179] i915_sel-4442 0d... 466527119us : intel_engine_stop_cs: rcs0
> > <0> [472.952305] <idle>-0 1..s1 466527119us : process_csb: rcs0 cs-irq head=3, tail=4
>
> Racing and this shows from old world?

We have the same CSB events being seen by process_csb() on two different
processors: one call issued by the reset in the test, the other by the
interrupt. This scenario is supposed to be prevented by flushing the
interrupt tasklet with tasklet_disable() before we enter the atomic
reset. However, I copied the code that uses tasklet_disable_nosync(),
which is meant to be used only from inside the atomic reset, after we
have serialised with the tasklet (or know we are inside the tasklet).

Basically, this bug is of our own invention because we are bypassing the
usual setup in order to do engine->reset() from unusual conditions.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
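
To make the distinction between the two calls concrete, here is a toy, single-threaded userspace model. It is only a sketch: struct toy_tasklet, other_cpu_step(), toy_disable(), toy_disable_nosync() and reset_races_with_tasklet() are all hypothetical names invented for illustration, not kernel or i915 code. The one thing it tries to mirror is that tasklet_disable_nosync() only raises the disable count, while tasklet_disable() additionally waits for a run already in flight on another CPU to finish.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct tasklet_struct; NOT the kernel definition. */
struct toy_tasklet {
	int count;      /* disable depth: >0 means no new run may start */
	bool running;   /* models the tasklet executing on another CPU */
	int work_left;  /* CSB events the in-flight run still has to consume */
};

/* Advance the "other CPU" by one step of its in-flight tasklet run. */
static void other_cpu_step(struct toy_tasklet *t)
{
	if (t->running && --t->work_left == 0)
		t->running = false;
}

/* Models tasklet_disable_nosync(): block future runs, do not wait. */
static void toy_disable_nosync(struct toy_tasklet *t)
{
	t->count++;
}

/* Models tasklet_disable(): block future runs, then wait out the current one. */
static void toy_disable(struct toy_tasklet *t)
{
	toy_disable_nosync(t);
	while (t->running)
		other_cpu_step(t);  /* stands in for spinning until the run ends */
}

/*
 * Returns true if the reset path would reach process_csb() while the
 * tasklet on the other CPU is still inside it (the race in the trace).
 */
static bool reset_races_with_tasklet(bool sync)
{
	/* Assumption: the interrupt tasklet is already mid-run when we start. */
	struct toy_tasklet t = { .count = 0, .running = true, .work_left = 3 };

	if (sync)
		toy_disable(&t);
	else
		toy_disable_nosync(&t);

	assert(t.count == 1);  /* either way, new runs are now blocked */
	return t.running;      /* still running => two CPUs see the same CSB */
}
```

In this model, reset_races_with_tasklet(false) reports an overlap and reset_races_with_tasklet(true) does not, which is the reason the test setup wants the synchronising variant before entering the atomic section.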