Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:

> Quoting Mika Kuoppala (2018-07-09 15:13:44)
>> Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:
>>
>> > Across a reset, the seqno (and thus hangcheck) should restart and the
>> > hangcheck naturally progress, for when it does not, we want to declare an
>> > emergency. Currently, we only detect if reset and reinit fails, but we
>> > do not detect if the call to reinit succeeds but the HW is fried - as we
>> > are resetting hangcheck on initialisation the engine. Remove that and
>> > rely on the natural progress to reset the hangcheck timer.
>>
>> I take it that the intention is not to give reset
>> any special leeway wrt to request completion. So
>> we now assume that reset/recovery must fit inside
>> one hangcheck tick?
>
> We call the synchronous i915_handle_error() from inside hangcheck, so we
> know the reset is completed before we schedule the next tick. So yes it
> seems fair that the recovery should always be expected to complete
> within that tick as we would expect any other batch to complete (and the
> recovery request is just to advance the breadcrumb, no batch).
>
> So yes, reset/recovery must fit inside the tick.

Worthy goal. And yes it explains the natural
progression in the commit message.

Reviewed-by: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>
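
For anyone following along, the behaviour agreed on above can be modelled
roughly as below. This is only a toy sketch under my own assumptions, not
the i915 code: struct engine_model, model_reset(), hangcheck_tick(),
run_ticks() and STALL_LIMIT are all hypothetical names, and model_reset()
merely stands in for the synchronous reset done from inside hangcheck. The
point it tries to show is that the reset does not clear the stall counter,
so only real seqno progress seen on the next tick can do that.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model for illustration; not the i915 data structures. */
struct engine_model {
	uint32_t hw_seqno;     /* last breadcrumb written by the "hardware" */
	uint32_t seen_seqno;   /* seqno observed at the previous tick */
	unsigned int stalled;  /* consecutive ticks without seqno progress */
	unsigned int resets;   /* number of resets attempted */
	bool hw_alive;         /* can the hardware still run a request? */
	bool wedged;           /* emergency declared, engine given up on */
};

/* Assumed threshold: ticks without progress before a reset is tried. */
enum { STALL_LIMIT = 3 };

/*
 * Stand-in for the synchronous reset. The recovery request only advances
 * the breadcrumb, and only if the hardware still works. The stall counter
 * is deliberately left untouched here.
 */
static void model_reset(struct engine_model *e)
{
	e->resets++;
	if (e->hw_alive)
		e->hw_seqno++;
}

static void hangcheck_tick(struct engine_model *e)
{
	if (e->wedged)
		return;

	/* Natural progress is the only thing that resets the counter. */
	if (e->hw_seqno != e->seen_seqno) {
		e->seen_seqno = e->hw_seqno;
		e->stalled = 0;
		return;
	}

	e->stalled++;
	if (e->stalled < STALL_LIMIT)
		return;

	/* Still no progress one tick after the reset: declare an emergency. */
	if (e->stalled > STALL_LIMIT) {
		e->wedged = true;
		return;
	}

	/* Reset completes before the next tick is scheduled. */
	model_reset(e);
}

static void run_ticks(struct engine_model *e, unsigned int n)
{
	while (n--)
		hangcheck_tick(e);
}
```

In this model a hung but recoverable engine is reset on the third stalled
tick and the fourth tick sees the advanced breadcrumb and clears the
counter. If the reset "succeeds" but the hardware is dead, the fourth tick
sees no progress and the engine is marked wedged.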