Quoting Jeff McGee (2017-08-28 20:46:00)
> On Mon, Aug 28, 2017 at 12:41:58PM -0700, Michel Thierry wrote:
> > On 28/08/17 12:25, jeff.mcgee@xxxxxxxxx wrote:
> > >From: Jeff McGee <jeff.mcgee@xxxxxxxxx>
> > >
> > >If someone else is resetting the engine we should clear our own bit as
> > >part of skipping that engine. Otherwise we will later believe that it
> > >has not been reset successfully and then trigger full gpu reset. If the
> > >other guy's reset actually fails, he will trigger the full gpu reset.
> > >
> >
> > Did you hit this by manually setting wedged to 'x' ring repeatedly?
> >
> I haven't actually reproduced it. Have just been looking at the code a
> lot to try to develop reset for preemption enforcement. The implementation
> will call i915_handle_error from another work item that can run concurrent
> with hangcheck.

Note to hit it in practice is a nasty bug. The assumption is that
between a pair of resets there was sufficient time for the engine to
recover, and so if we reset too quickly we conclude that the
reset/recovery mechanism is broken.

And if you do start playing with fast resets, you very quickly find that
kthread_park is a livelock waiting to happen.
-Chris
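
For readers following the thread, the skip-and-clear behaviour described in
the quoted patch can be modelled in user space roughly as below. This is only
a sketch under stated assumptions, not the i915 code: reset_engines(),
engine_resetting[], NUM_ENGINES and the two callbacks are hypothetical names,
and C11 atomics stand in for the kernel's per-engine reset bits.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NUM_ENGINES 4u /* hypothetical engine count for this sketch */

/* One "reset in progress" flag per engine (hypothetical stand-in for the
 * driver's per-engine reset bits). Static storage, so zero-initialised. */
atomic_bool engine_resetting[NUM_ENGINES];

/* Hypothetical per-engine reset callback: returns true on success. */
typedef bool (*reset_fn)(unsigned int engine_id);

bool reset_always_ok(unsigned int engine_id)    { (void)engine_id; return true; }
bool reset_always_fails(unsigned int engine_id) { (void)engine_id; return false; }

/*
 * Try to reset every engine whose bit is set in engine_mask.
 * Returns the bits this caller still considers un-reset; a nonzero
 * result means the caller would escalate to a full GPU reset.
 */
unsigned int reset_engines(unsigned int engine_mask, reset_fn do_reset)
{
    assert(do_reset);

    for (unsigned int id = 0; id < NUM_ENGINES; id++) {
        unsigned int bit = 1u << id;

        if (!(engine_mask & bit))
            continue;

        /* Atomically claim the engine; true means someone else owns it. */
        if (atomic_exchange(&engine_resetting[id], true)) {
            /* The point of the patch: clear our own bit when skipping.
             * If the other reset fails, its owner escalates, not us. */
            engine_mask &= ~bit;
            continue;
        }

        if (do_reset(id))
            engine_mask &= ~bit;

        /* Release our claim on this engine. */
        atomic_store(&engine_resetting[id], false);
    }

    return engine_mask;
}
```

Without the `engine_mask &= ~bit` in the skip path, a caller that loses the
race for an engine would return with that bit still set and would treat it as
a failed reset, which is the spurious full-GPU-reset case the patch describes.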