Quoting Chris Wilson (2019-09-07 09:39:52)
> Quoting Daniele Ceraolo Spurio (2019-09-06 23:28:05)
> >
> >
> > On 9/5/19 2:09 AM, Janusz Krzysztofik wrote:
> > > When trying to reset a device with reset capability disabled or not
> > > supported while rings are full of requests, it has been observed when
> > > running in execlists submission mode that command stream buffer tail
> > > tends to be incremented by apparently still running GPU regardless of
> > > all requests being already cancelled and command stream buffer pointers
> > > reset. As a result, kernel panic on NULL pointer dereference occurs
> > > when a trace_ports() helper is called with command stream buffer tail
> > > incremented but request pointers being NULL during final
> > > __intel_gt_set_wedged() operation called from intel_gt_reset().
> > >
> > > Skip actual reset procedure if reset is disabled or not supported.
> >
> > This last sentence is a bit confusing. You're not skipping the reset
> > procedure, you're skipping the attempt of unwedging and resetting again
> > after a reset & wedge already happened.
>
> Loss of email over the last week, so jumping in at the end. My gut
> response is that this is still just papering over the bug, as what you
> say above makes no sense.

So my gut response was to the run-on sentence, when all you needed to
say was that without a successful reset prior to calling
reset_default_submission, the engine may still generate CS events out
of the blue.

And I think the patch should be written to require the successful reset.
-Chris
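
To illustrate "require the successful reset", here is a toy model in plain C. It is only a sketch of the ordering constraint, not i915 code: struct toy_engine, toy_hw_reset() and toy_unwedge() are hypothetical names made up for this example, and the real driver paths are considerably more involved. The point is that the software submission state is reinitialised only once the hardware reset has actually succeeded, so a still-running engine cannot race with freshly reset state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these types or functions exist in i915. */
struct toy_engine {
	bool hw_running;       /* engine may still generate CS events */
	bool submission_reset; /* software submission state reinitialised */
};

/* Hypothetical hardware reset: fails if reset is disabled or unsupported. */
static bool toy_hw_reset(struct toy_engine *e, bool reset_supported)
{
	assert(e != NULL);

	if (!reset_supported)
		return false; /* engine keeps running */

	e->hw_running = false;
	return true;
}

/*
 * Hypothetical unwedge path: reinitialise the submission state only
 * after a successful hardware reset, otherwise stay wedged.
 */
static int toy_unwedge(struct toy_engine *e, bool reset_supported)
{
	assert(e != NULL);

	if (!toy_hw_reset(e, reset_supported))
		return -1; /* no successful reset: leave state untouched */

	e->submission_reset = true;
	return 0;
}
```

With reset unsupported, toy_unwedge() returns an error and leaves submission_reset false; with reset supported, the engine is stopped first and only then is the submission state marked as reset.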