Quoting Jeff McGee (2017-08-28 21:18:44) > On Mon, Aug 28, 2017 at 08:44:48PM +0100, Chris Wilson wrote: > > Quoting jeff.mcgee@xxxxxxxxx (2017-08-28 20:25:30) > > > From: Jeff McGee <jeff.mcgee@xxxxxxxxx> > > > > > > If someone else is resetting the engine we should clear our own bit as > > > part of skipping that engine. Otherwise we will later believe that it > > > has not been reset successfully and then trigger full gpu reset. If the > > > other guy's reset actually fails, he will trigger the full gpu reset. > > > > The reason we did continue on to the global reset was to serialise > > i915_handle_error() with the other thread. Not a huge issue, but a > > reasonable property to keep -- and we definitely want a to explain why > > only one reset at a time is important. > > > > bool intel_engine_lock_reset() { > > if (!test_and_set_bit(I915_RESET_ENGINE + engine->id, > > &engine->i915->gpu_error.flags)) > > return true; > > > > intel_engine_wait_for_reset(engine); > The current code doesn't wait for the other thread to finish the reset, but > this would add that wait. Pardon? If we can't reset the engine, we go to the full reset which is serialised, both with individual engine resets and other globals. > Did you intend that as an additional change to > the current code? I don't think it is necessary. Each thread wants to > reset some subset of engines, so it seems the thread can safely exit as soon > as it knows each of those engines has been reset or is being reset as part > of another thread that got the lock first. If any of the threads fail to > reset an engine they "own", then full gpu reset is assured. It's unexpected for this function to return before the reset. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx