On Mon, 11 Mar 2024, Michael Schmitz wrote:
Looping over 178 boots (using init=/sbin/reboot), I see eight of the spinlock recursion messages in ARAnyM on my old PowerBook G4:

BUG: spinlock recursion on CPU#0, swapper/1
BUG: spinlock recursion on CPU#0, swapper/1
BUG: spinlock recursion on CPU#0, pool_workqueue_/3
BUG: spinlock recursion on CPU#0, swapper/2
BUG: spinlock recursion on CPU#0, pool_workqueue_/3
BUG: spinlock recursion on CPU#0, pool_workqueue_/3
BUG: spinlock recursion on CPU#0, swapper/2
BUG: spinlock recursion on CPU#0, pool_workqueue_/3
Not the reliable reproducer I was hoping for, but it is progress. We now know the problem shows up in both ARAnyM and QEMU.
Trying the same on a much faster Intel system, no messages are seen. I'll try locking the PowerBook at half the CPU clock rate next.

...

The tests on unlocking certainly aren't atomic, but those are not the ones we see in the messages. The tests on locking use READ_ONCE(), so they ought to be safe. The locking primitives are not atomic at all, by design ('No atomicity anywhere, we are on UP'. While not debugging, spinlocks are NOPs on UP.)
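To make the discussion of the lock and unlock tests concrete, here is a minimal userspace sketch of an owner-based debug check. It is only a model under stated assumptions: the types and functions (model_task, model_lock, model_lock_acquire, model_lock_release) are hypothetical names invented for illustration and are not the kernel's implementation. It shows that a "recursion" report depends solely on the owner field matching the current task at lock time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a task and a debug lock; not kernel types. */
struct model_task {
    const char *comm;   /* task name, e.g. "swapper" */
    int pid;
};

struct model_lock {
    int locked;                      /* 1 while held, 0 otherwise */
    const struct model_task *owner;  /* NULL while nobody owns the lock */
};

enum model_check {
    MODEL_OK,
    MODEL_RECURSION,
    MODEL_WRONG_OWNER,
    MODEL_ALREADY_UNLOCKED
};

/*
 * Test made before taking the lock: if the recorded owner is already
 * the current task, report recursion and leave the lock untouched.
 */
enum model_check model_lock_acquire(struct model_lock *lock,
                                    const struct model_task *current_task)
{
    assert(lock != NULL && current_task != NULL);
    if (lock->owner == current_task)
        return MODEL_RECURSION;
    lock->locked = 1;
    lock->owner = current_task;
    return MODEL_OK;
}

/*
 * Tests made on unlock: the lock must be held, and held by the caller.
 * Only then is the owner record cleared.
 */
enum model_check model_lock_release(struct model_lock *lock,
                                    const struct model_task *current_task)
{
    assert(lock != NULL && current_task != NULL);
    if (!lock->locked)
        return MODEL_ALREADY_UNLOCKED;
    if (lock->owner != current_task)
        return MODEL_WRONG_OWNER;
    lock->owner = NULL;
    lock->locked = 0;
    return MODEL_OK;
}
```

In this model nothing actually spins or excludes anyone; the only state is the bookkeeping, so any report comes down to what the owner field contained when it was read.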
I think spin_lock() reduces to preempt_disable() on UP. In include/linux/spinlock_api_up.h it says,

/*
 * In the UP-nondebug case there's no real locking going on, so the
 * only thing we have to do is to keep the preempt counts and irq
 * flags straight, to suppress compiler warnings of unused lock
 * variables, and to add the proper checker annotations:
 */
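As a rough illustration of what "no real locking" means, here is a small userspace sketch, assuming a single hypothetical counter (model_preempt_count) stands in for the preempt count. The names model_up_lock, model_up_spin_lock and model_up_spin_unlock are made up for this example and are not the kernel's macros.

```c
#include <assert.h>

/* Hypothetical preemption counter; a stand-in for the real preempt count. */
int model_preempt_count = 0;

/* The lock carries no state that the lock/unlock paths ever inspect. */
struct model_up_lock {
    char unused;
};

void model_up_spin_lock(struct model_up_lock *lock)
{
    (void)lock;             /* referenced only so the variable counts as used */
    model_preempt_count++;  /* stands in for preempt_disable() */
}

void model_up_spin_unlock(struct model_up_lock *lock)
{
    (void)lock;
    assert(model_preempt_count > 0);  /* unbalanced unlock would underflow */
    model_preempt_count--;  /* stands in for preempt_enable() */
}
```

Note that in this model taking the same lock twice simply nests the counter; without owner bookkeeping there is nothing that could report a recursion.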
I wonder whether CONFIG_DEBUG_SPINLOCK was ever meant to work at all on UP?
I've no idea, sorry. The people who would be able to help are listed in the "LOCKING PRIMITIVES" section of MAINTAINERS.