On Thu, May 23 2024 at 23:03, Vlastimil Babka wrote:
> On 5/23/24 12:36 PM, Thomas Gleixner wrote:
>>> ------------[ cut here ]------------
>>> DEBUG_LOCKS_WARN_ON(l->owner)
>>> WARNING: CPU: 3 PID: 5221 at include/linux/local_lock_internal.h:30 local_lock_acquire include/linux/local_lock_internal.h:30 [inline]
>>> WARNING: CPU: 3 PID: 5221 at include/linux/local_lock_internal.h:30 flush_slab mm/slub.c:3088 [inline]
>>> WARNING: CPU: 3 PID: 5221 at include/linux/local_lock_internal.h:30 flush_cpu_slab+0x37f/0x410 mm/slub.c:3146
>
> I'm puzzled by this. We use local_lock_irqsave() on !PREEMPT_RT everywhere.
> IIUC this warning says we did the irqsave() and then found out somebody else
> already set the owner? But that means they also did that irqsave() and set
> themselves as l->owner. Does that mean there would be a spurious irq enable
> that didn't go through local_unlock_irqrestore()?
>
> Also this particular stack is from the work, which is scheduled by
> queue_work_on() in flush_all_cpus_locked(), which also has a
> lockdep_assert_cpus_held() so it should fulfill the "the caller must ensure
> the cpu doesn't go away" property. But I think even if this ended up on the
> wrong cpu (for the full duration or migrated while processing the work item)
> somehow, it wouldn't be able to cause such a warning, but rather corrupt
> something else

Indeed. There is another report which makes no sense either:

  https://lore.kernel.org/lkml/000000000000fa09d906191c3ee5@xxxxxxxxxx

Both look like data corruption issues caused by whatever...

Thanks,

        tglx