On Fri, Mar 24, 2017 at 06:29:56PM -0700, Michel Thierry wrote:
> From: Arun Siluvery <arun.siluvery@xxxxxxxxxxxxxxx>
>
> This is a preparatory patch which modifies error handler to do per engine
> hang recovery. The actual patch which implements this sequence follows
> later in the series. The aim is to prepare existing recovery function to
> adapt to this new function where applicable (which fails at this point
> because core implementation is lacking) and continue recovery using legacy
> full gpu reset.
>
> A helper function is also added to query the availability of engine
> reset.
>
> The error events that are used to notify user of reset are
> adapted to engine reset such that it doesn't break users listening to these
> events. In legacy we report an error event, a reset event before resetting
> the gpu and a reset done event marking the completion of reset. The same
> behaviour is adapted but reset event is only dispatched once even when
> multiple engines are hung. Finally once reset is complete we send reset
> done event as usual.
>
> Note that this implementation of engine reset is for i915 directly
> submitting to the ELSP, where the driver manages the hang detection,
> recovery and resubmission. With GuC submission these tasks are shared
> between driver and firmware; i915 will still be responsible for detecting a
> hang, and when it does it will have to request GuC to reset that Engine and
> remind the firmware about the outstanding submissions. This will be
> added in a different patch.
>
> v2: rebase, advertise engine reset availability in platform definition,
> add note about GuC submission.
> v3: s/*engine_reset*/*reset_engine*/. (Chris)
> Handle reset as 2 level resets, by first going to engine only and
> falling back to full/chip reset as needed, i.e. reset_engine will need the
> struct_mutex.
> v4: Pass the engine mask to i915_reset. (Chris)
> v5: Rebase, update selftests.
>
> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>
> Signed-off-by: Ian Lister <ian.lister@xxxxxxxxx>
> Signed-off-by: Tomas Elf <tomas.elf@xxxxxxxxx>
> Signed-off-by: Arun Siluvery <arun.siluvery@xxxxxxxxxxxxxxx>
> Signed-off-by: Michel Thierry <michel.thierry@xxxxxxxxx>

Four authors in, and this patch is still trying to do reset_engine under
the mutex, requiring the handoff. Why? We should be ready now to do the
first pass of resets before the mutex: we don't even need to prepare the
display for the per-engine resets, so that pass should be much lighter.

We can land the per-engine struct_mutex resets for hangcheck as soon as it
is ready as it would see immediate testing.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
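
For readers following the thread, the two-level sequence described in the quoted commit message (try per-engine reset first, fall back to a full reset only for what is left, and dispatch the reset event once however many engines hung) can be sketched in plain user-space C. This is an illustration under stated assumptions, not the i915 implementation: `struct fake_gpu`, `fake_reset_engine()`, `fake_full_reset()`, `handle_hang()` and `NUM_ENGINES` are hypothetical names invented for the sketch, and no locking is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_ENGINES 4u /* hypothetical engine count for this sketch */

/* Hypothetical stand-in for device state; not the i915 structures. */
struct fake_gpu {
	uint32_t resettable_mask;       /* engines whose per-engine reset succeeds */
	unsigned int engine_resets;     /* successful per-engine resets */
	unsigned int full_resets;       /* full chip resets performed */
	unsigned int reset_events;      /* "reset" notifications sent */
	unsigned int reset_done_events; /* "reset done" notifications sent */
};

/* Pretend per-engine reset: succeeds only for engines in resettable_mask. */
static bool fake_reset_engine(struct fake_gpu *gpu, unsigned int id)
{
	if (!(gpu->resettable_mask & (1u << id)))
		return false;
	gpu->engine_resets++;
	return true;
}

/* Pretend full chip reset: the heavyweight fallback. */
static void fake_full_reset(struct fake_gpu *gpu)
{
	gpu->full_resets++;
}

/*
 * Two-level recovery: light per-engine pass first, full reset only if some
 * engine could not be recovered. Returns the mask of engines that forced
 * the fallback (0 if the per-engine pass was enough).
 */
static uint32_t handle_hang(struct fake_gpu *gpu, uint32_t hung_mask)
{
	uint32_t remaining = hung_mask;
	unsigned int id;

	assert((hung_mask >> NUM_ENGINES) == 0); /* only known engines */
	if (!hung_mask)
		return 0;

	gpu->reset_events++; /* sent once, even when several engines hung */

	for (id = 0; id < NUM_ENGINES; id++) {
		if (!(remaining & (1u << id)))
			continue;
		if (fake_reset_engine(gpu, id))
			remaining &= ~(1u << id);
	}

	if (remaining)
		fake_full_reset(gpu);

	gpu->reset_done_events++; /* sent after recovery completes */
	return remaining;
}
```

In this sketch the per-engine loop is the part the reply argues could run before taking the mutex, with only the `fake_full_reset()` fallback needing the heavier path.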