On 2021-11-19 10:39:20 [+0000], Andreas Glatz wrote:
> Hi

Hi,

> I patched 4.19.100 with rt41 patch set and we ported the Micrel
> ksz8462_h Ethernet driver. The driver has one threaded IRQ triggered
> by the interrupt from the ksz8462 chip and two workers, one for
> gathering MIBs and one for checking the link status. Everything
> seemingly ran ok for quite some time. However, yesterday I noticed
> that the IRQ thread died in rt_spin_lock_slowlock_locked() as per
> stacktrace below at:
>
> 0xc0af82f8 is in rt_spin_lock_slowlock_locked
> (/usr/src/kernel/kernel/locking/rtmutex.c:1105).
> 1100     * unconditionally. We might have to fix that up:
> 1101     */
> 1102    fixup_rt_mutex_waiters(lock);
> 1103
> 1104    BUG_ON(rt_mutex_has_waiters(lock) && waiter == rt_mutex_top_waiter(lock));
> 1105    BUG_ON(!RB_EMPTY_NODE(&waiter->tree_entry));
> 1106 }
> 1107
> 1108 static void noinline __sched rt_spin_lock_slowlock(struct rt_mutex *lock)
> 1109 {
>
> In the jtag debugger I see that at the same time the other two
> kthreads are waiting on the spinlock that's held by the IRQ thread
> that died on two of the four CPU cores of the i.MX6q.
>
> Any ideas what might cause this and how to fix it?

So the lock owner exploded in BUG_ON() and every lock attempt will fail
since the slow-path is forced and the wait_lock is still acquired.
The BUG_ON() statement suggests that the thread is enqueued as a waiter
but shouldn't be, since it obtained the lock.
From your backtrace:

| Internal error: Oops - undefined instruction: 0 [#1] PREEMPT SMP ARM
…
| CPU: 0 PID: 1457 Comm: irq/77-ksz8462_ Tainted: G W O 4.19.100-rt41 #1
…
| Process irq/77-ksz8462_ (pid: 1457, stack limit = 0x968e9d88)
| [<c0af82f8>] (rt_spin_lock_slowlock_locked) from [<c0af8384>] (rt_spin_lock_slowlock+0x64/0x94)
| [<c0af8384>] (rt_spin_lock_slowlock) from [<c0afab28>] (rt_spin_lock+0x7c/0x84)
| [<c0afab28>] (rt_spin_lock) from [<bf1c4418>] (ks_irq+0x48/0x540 [ksz8462_h])
| [<bf1c4418>] (ks_irq [ksz8462_h]) from [<c01933f0>] (irq_forced_thread_fn+0x30/0xa8)

The confusing part is that you use sleeping locks but the banner says
PREEMPT instead of PREEMPT_RT. Any chance that you don't have
PREEMPT_RT_FULL enabled?

> Many thanks and regards,
>
> Andreas

Sebastian