Hi Nitin,

On Wed, Mar 05, 2025 at 07:45:31AM +0000, Gote, Nitin R wrote:
> > On Mon, Feb 24, 2025 at 12:01:04PM +0530, Nitin Gote wrote:
> > > Sometimes engine reset fails because the engine resumes from an
> > > incorrect RING_HEAD. Engine head failed to set to zero even after
> > > writing into it. This is a timing issue and we experimented different
> > > values and found out that 20ms delay works best based on testing.
> > >
> > > So, add a 20ms delay to let engine resumes from correct RING_HEAD.
> > >
> > > Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13650
> > > Signed-off-by: Nitin Gote <nitin.r.gote@xxxxxxxxx>
> > > ---
> > > drivers/gpu/drm/i915/gt/intel_ring_submission.c | 7 +++++++
> > > 1 file changed, 7 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
> > > b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
> > > index 6e9977b2d180..5576f000e965 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
> > > @@ -365,6 +365,13 @@ static void reset_prepare(struct intel_engine_cs
> > *engine)
> > > ENGINE_READ_FW(engine, RING_HEAD),
> > > ENGINE_READ_FW(engine, RING_TAIL),
> > > ENGINE_READ_FW(engine, RING_START));
> > > + /*
> > > + * Sometimes engine head failed to set to zero even after writing
> > into it.
> > > + * Use 20ms delay to let engine resumes from correct
> > RING_HEAD.
> > > + * Experimented different values and determined that 20ms
> > works best
> > > + * based on testing.
> > > + */
> > > + mdelay(20);
> >
> > Is there any extremely strong reason for using mdelay here, rather than any other
> > delay function?
> >
> > Andi
>
> Yes. Firstly I checked with udelay(20000) and while testing a test for 1000 times,
> a couple of times got an issue of "BUG: scheduling while atomic: i915_selftest/10313/0x00000201" from the scheduler.
> Adding here a failure stack trace in case you want to take a look.
>
> And that's why I used mdelay(20), where I have not seen this issue. I have tested with mdelay(20), thousands of times and it worked.

That is not a good reason for using mdelay. We would be very bad
citizens if we used mdelay for such a long time: mdelay busy-waits and
keeps one core stuck for the whole 20ms just for this purpose.

This is a straight NACK. Can you try with fsleep?

Thanks,
Andi