Re: [PATCH] drm/i915/gt: Add a delay to let engine resumes correctly

Hi Nitin,

On Wed, Mar 05, 2025 at 07:45:31AM +0000, Gote, Nitin R wrote:
> > On Mon, Feb 24, 2025 at 12:01:04PM +0530, Nitin Gote wrote:
> > > Sometimes engine reset fails because the engine resumes from an
> > > incorrect RING_HEAD. Engine head failed to set to zero even after
> > > writing into it. This is a timing issue and we experimented different
> > > values and found out that 20ms delay works best based on testing.
> > >
> > > So, add a 20ms delay to let engine resumes from correct RING_HEAD.
> > >
> > > Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13650
> > > Signed-off-by: Nitin Gote <nitin.r.gote@xxxxxxxxx>
> > > ---
> > >  drivers/gpu/drm/i915/gt/intel_ring_submission.c | 7 +++++++
> > >  1 file changed, 7 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
> > > index 6e9977b2d180..5576f000e965 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
> > > @@ -365,6 +365,13 @@ static void reset_prepare(struct intel_engine_cs *engine)
> > >  			     ENGINE_READ_FW(engine, RING_HEAD),
> > >  			     ENGINE_READ_FW(engine, RING_TAIL),
> > >  			     ENGINE_READ_FW(engine, RING_START));
> > > +		/*
> > > +		 * Sometimes engine head failed to set to zero even after writing into it.
> > > +		 * Use 20ms delay to let engine resumes from correct RING_HEAD.
> > > +		 * Experimented different values and determined that 20ms works best
> > > +		 * based on testing.
> > > +		 */
> > > +		mdelay(20);
> > 
> > Is there any extremely strong reason for using mdelay here, rather than any other
> > delay function?
> > 
> > Andi
> 
> Yes. Firstly I checked with udelay(20000) and while testing a test for 1000 times, 
> a couple of times got an issue of "BUG: scheduling while atomic: i915_selftest/10313/0x00000201" from the scheduler.
> Adding here a failure stack trace in case you want to take a look.
> 
> And that's why I used mdelay(20), where I have not seen this issue. I have tested with mdelay(20), thousands of times and it worked.

That's not a good reason for using mdelay. We would be very bad
citizens for wanting to use mdelay for such a long time.

mdelay busy-waits and keeps one core stuck for the whole 20ms just
for this purpose. This is a straight NACK.

Can you try with fsleep?
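
For illustration only, here is a minimal userspace sketch of how fsleep()
picks a delay primitive from the requested duration. It is modelled on the
helper in include/linux/delay.h as found in older kernels; the thresholds are
an assumption and differ between kernel versions. The names fsleep_model and
enum delay_kind are hypothetical and exist only in this sketch.

```c
/*
 * Hypothetical userspace model, NOT kernel code.
 * It only reports which primitive an fsleep()-style helper would pick;
 * it does not delay. Thresholds are assumed from older kernel versions.
 */
enum delay_kind {
	DELAY_UDELAY,       /* busy-wait, only for very short delays */
	DELAY_USLEEP_RANGE, /* hrtimer-based sleep, CPU is released */
	DELAY_MSLEEP        /* jiffies-based sleep, CPU is released */
};

static enum delay_kind fsleep_model(unsigned long usecs)
{
	if (usecs <= 10)
		return DELAY_UDELAY;
	if (usecs <= 20000)
		return DELAY_USLEEP_RANGE;
	return DELAY_MSLEEP;
}
```

Under these assumptions a 20ms request, fsleep_model(20000), lands in the
usleep_range() branch, so the CPU is released instead of spinning. Because
that path sleeps, it is only valid where the caller is allowed to sleep,
that is, not in atomic context.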

Thanks,
Andi


