Re: Q: Possible reason(s) for BUG in rt_spin_lock_slowlock_locked()

Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> · Fri, 19 Nov 2021 15:19:11 +0100

On 2021-11-19 13:45:49 [+0000], Andreas Glatz wrote:
> > > > In the jtag debugger I see that at the same time the other two
> > > > kthreads are waiting on the spinlock that's held by the IRQ thread
> > > > that died on two of the four CPU cores of the i.MX6q.
> > > >
> > > > Any ideas what might cause this and how to fix it?
> > >
> > > So the lock owner exploded in BUG_ON() and every lock attempt will fail
> > > since the slow-path is forced and the wait_lock is still acquired.
> > >
> > > The BUG_ON() statement suggests that the thread is enqueued as a waiter
> > > but shouldn't be, since it obtained the lock. From your backtrace:
> >
> > Right... any idea for investigating why this might be? I assume a
> > particular IRQ thread should be unique in the system? Maybe it didn't
> > release the lock the last time it ran?
> 
> I found an instance where we did not unlock the spinlock before
> returning from a function :( I'll test again...

Try enabling lockdep with lock proving, the schedule-while-atomic check
and such. Lockdep should warn you if you attempt to return to userland
with locks held.
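
A minimal .config sketch of those debug options, assuming the usual
mainline option names apply to this 4.19-rt tree. Mapping the
"schedule-while-atomic" check to CONFIG_DEBUG_ATOMIC_SLEEP is an
assumption, and the last two options may already be selected by
CONFIG_PROVE_LOCKING:

```
# Lock proving; this pulls in the lockdep core.
CONFIG_PROVE_LOCKING=y
# Assumed to be the sleep-in-atomic check referred to above.
CONFIG_DEBUG_ATOMIC_SLEEP=y
# Extra lock sanity checks; possibly redundant with PROVE_LOCKING.
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_LOCK_ALLOC=y
```

Expect noticeable overhead with these enabled, so this is a debug build
rather than something to ship.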

> > > The confusing part is that you use sleeping locks but the banner says
> > > PREEMPT instead of PREEMPT_RT.
> > > Any chance that you don't have PREEMPT_RT_FULL enabled?
> >
> > I just checked the .config as well as /proc/version and it seems to be
> > enabled... so yes, this is strange - thanks for pointing this out.
> >
> > # cat /proc/version
> > Linux version 4.19.100-rt41 (oe-user@oe-host) (gcc version 8.3.0
> > (GCC)) #1 SMP PREEMPT RT Mon Nov 1 15:30:04 UTC 2021

If it is there, then maybe it is only the ARM-specific output that is missing it…

> > > > Many thanks and regards,
> > > >
> > > > Andreas

Sebastian