On Fri, Dec 16, 2022 at 10:44:06AM +0900, Damien Le Moal wrote:

> The original & complete lockdep splat is in the report email here:
>
> https://marc.info/?l=linux-ide&m=167094379710177&w=2
>
> It looks like a spinlock is taken for the fasync stuff without irq
> disabled and that same spinlock is needed in kill_fasync() which is
> itself called (potentially) with IRQ disabled. Hence the splat. In any
> case, that is how I understand the issue. But as mentioned above, given
> that I can see many drivers calling kill_fasync() with irq disabled, I
> wonder if this is a genuine potential problem or a false negative.

OK, I'm about to fall asleep, so I might very well be missing something
obvious, but...

CPU1: ptrace(2)
    ptrace_check_attach()
        read_lock(&tasklist_lock);

CPU2: setpgid(2)
    write_lock_irq(&tasklist_lock);
    spins

CPU1: takes an interrupt that would call kill_fasync().  grep and the
first instance of kill_fasync() is in hpet_interrupt() - it's not
something exotic.  IRQs disabled on CPU2 won't stop it.
    kill_fasync(..., SIGIO, ...)
        kill_fasync_rcu()
            read_lock_irqsave(&fa->fa_lock, flags);
            send_sigio()
                read_lock_irqsave(&fown->lock, flags);
                read_lock(&tasklist_lock);

... and CPU1 spins as well.

It's not a matter of kill_fasync() called with IRQs disabled; the
problem is kill_fasync() called from interrupt taken while holding
tasklist_lock at least shared.  Somebody trying to grab it on another
CPU exclusive before we get to send_sigio() from kill_fasync() will end
up spinning and will make us spin as well.

I really hope that's just me not seeing something obvious - we had
kill_fasync() called in IRQ handlers since way back and we had
tasklist_lock taken shared without disabling IRQs for just as long.

<goes to sleep, hoping to find "Al, you are a moron, it's obviously OK
for such and such reasons" in the mailbox tomorrow morning>
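
For illustration only, the interleaving above can be replayed against a toy model of a reader/writer lock. This is a sketch, not kernel code: every toy_* name is made up, and the model assumes a lock on which a waiting writer blocks every new reader, including one arriving from an interrupt handler. If that assumption does not hold for tasklist_lock, the conclusion does not carry over.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy lock, NOT the kernel's rwlock_t.
 * Assumption: a waiting writer blocks all new readers. */
struct toy_rwlock {
    int  readers;         /* number of current read holders */
    bool writer;          /* held exclusive */
    bool writer_waiting;  /* a writer is spinning for exclusive access */
};

/* One attempt at a shared acquisition; false means "would spin". */
static bool toy_read_trylock(struct toy_rwlock *l)
{
    if (l->writer || l->writer_waiting)
        return false;
    l->readers++;
    return true;
}

/* One attempt at an exclusive acquisition; false means "would spin".
 * A failed attempt leaves the writer marked as waiting. */
static bool toy_write_trylock(struct toy_rwlock *l)
{
    if (l->readers > 0 || l->writer) {
        l->writer_waiting = true;
        return false;
    }
    l->writer_waiting = false;
    l->writer = true;
    return true;
}

/* Replay the three steps of the scenario.
 * Returns true if both CPUs end up unable to make progress. */
static bool toy_replay(void)
{
    struct toy_rwlock tasklist = { 0, false, false };

    /* CPU1, process context: ptrace_check_attach() takes it shared. */
    bool cpu1_outer = toy_read_trylock(&tasklist);
    assert(cpu1_outer);  /* the lock was uncontended, so this succeeds */

    /* CPU2: setpgid() wants it exclusive and has to wait for CPU1. */
    bool cpu2_write = toy_write_trylock(&tasklist);

    /* CPU1, interrupt: send_sigio() wants it shared again. */
    bool cpu1_nested = toy_read_trylock(&tasklist);

    /* CPU1 cannot drop the outer hold until the interrupt returns,
     * the interrupt cannot return until the nested read succeeds,
     * and CPU2 cannot proceed until CPU1 drops the outer hold. */
    return !cpu2_write && !cpu1_nested;
}
```

In this model toy_replay() returns true: the writer waits on the outer reader, and the nested reader waits on the writer. Without the writer in between, two nested toy_read_trylock() calls both succeed, which is why the order of the three steps matters.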