Quoting Michał Winiarski (2017-09-18 09:53:50)
> On Sat, Sep 16, 2017 at 09:44:11PM +0100, Chris Wilson wrote:
> > During a reset, we may skip over completed requests and lost
> > context-switch interrupts. Following the reset, we may then end up
> > with no active requests in the ELSP (and so do not resubmit to restart
> > the engine), but have a queue of requests ready for execution. This is
> > unlikely, it requires the last request to complete after the hang is
> > detected, but not impossible. The outcome of this is that the engine
> > stalls, possibly leading to full ring and indefinite wait under
> > struct_mutex, eventually leading to a full driver hang.
> >
> > Alternatively, we can solve this by unsubmitting the incomplete requests
> > and just kickstarting the tasklet. Michał has patches for that, which I
> > initially disliked due to the extra complexity, but the complexity of
> > this "simple" restart is growing...
>
> You are doing exactly that in 4/4.
> Perhaps squash the two together to avoid moving code around, although this one
> is a genuine fix, so I guess it's also fine on its own.

It was a fix that introduced the concept of calling tasklet_schedule
during restart, which is then expanded on by 4/4 to do everything. I
liked the progression.

> If you rebase the whole thing on top of coalesced GuC requests (which now is all
> reviewed and ready to be merged), we'll have uniform reset handling for GuC
> and execlists.

Bugfix wins :-p Are you happy if I pull in the coalesced GuC requests
with this amendment:

@@ -1181,7 +1182,7 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
                 */
                engine->irq_tasklet.func = i915_guc_irq_handler;
                clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
-               i915_guc_submit(engine);
+               tasklet_schedule(&engine->irq_tasklet);
        }
 
        return 0;

With the desc->tail fix, GuC has been stable for a day of mixed hang
testing.
-Chris