Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:

> Quoting Mika Kuoppala (2018-08-13 11:42:42)
>> If engine reports that it is not ready for reset, we
>> give up. Evidence shows that forcing a per engine reset
>> on an engine which is not reporting to be ready for reset,
>> can bring it back into a working order. There is risk that
>> we corrupt the context image currently executing on that
>> engine. But that is a risk worth taking as if we unblock
>> the engine, we prevent a whole device wedging in a case
>> of full gpu reset.
>>
>> Reset individual engine even if it reports that it is not
>> prepared for reset, but only if we aim for full gpu reset
>> and not on first reset attempt.
>>
>> v2: force reset only on later attempts, readability (Chris)
>> v3: simplify with adequate caffeine levels (Chris)
>>
>> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
>> Signed-off-by: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>
> Reviewed-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
>
> One last thing, you said you recalled one of the reasons for its
> existence was to prevent machine lockups on kbl. Is the recollection
> true? Do we want to leave a comment in case of fire?

We got machine lockups if we reset a non-stopped, active engine
inside a batchbuffer. i915_stop_engines() arose from that, and we have
a comment in intel_gpu_reset explaining it.

That lockup apparently happened regardless of the ready-to-reset ack.
As I read it, we got ready-to-reset acks on active engines, which
then died if we proceeded.

So this patch should not make things worse, as i915_stop_engines() has
held water.

-Mika *knocks on wood*
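
For anyone skimming the thread, the policy in the quoted commit message can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the patch: the helper name should_reset_engine() and its parameters are hypothetical, and the attempt counter is assumed to be zero-based.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not the actual i915 code: decide whether to go
 * ahead with resetting one engine.
 *
 * engine_ready   - the engine acked the ready-to-reset request
 * full_gpu_reset - the whole device is being reset, not a single engine
 * attempt        - reset attempt counter, assumed to start at 0
 */
static bool should_reset_engine(bool engine_ready, bool full_gpu_reset,
				unsigned int attempt)
{
	/* An engine that acked the request is reset as before. */
	if (engine_ready)
		return true;

	/*
	 * No ack: force the reset only as part of a full GPU reset, and
	 * only once the first attempt has already failed.
	 */
	return full_gpu_reset && attempt >= 1;
}
```

The first attempt still respects a missing ack, so the risk of corrupting the executing context image is only taken when the alternative is a wedged device.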