* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> I'm pretty sure that the thing that triggered this is once more commit
> a9b4f08770b4 ("x86/ioapic: Restore IO-APIC irq_chip retrigger
> callback") which seems to retrigger stale irqs that simply should not
> be retriggered.
>
> They aren't actually active any more, if they ever were.
>
> So that commit seems to act like a random CONFIG_DEBUG_SHIRQ. It's
> good for testing, but not good for actual users.

Yeah, so some distros like Fedora already have CONFIG_DEBUG_SHIRQ=y enabled,
but part of the problem is that CONFIG_DEBUG_SHIRQ=y has this:

#ifdef CONFIG_DEBUG_SHIRQ_FIXME
	if (!retval && (irqflags & IRQF_SHARED)) {
		/*
		 * It's a shared IRQ -- the driver ought to be prepared for it
		 * to happen immediately, so let's make sure....
		 * We disable the irq to make sure that a 'real' IRQ doesn't
		 * run in parallel with our fake.
		 */
		unsigned long flags;

		disable_irq(irq);
		local_irq_save(flags);

		handler(irq, dev_id);

		local_irq_restore(flags);
		enable_irq(irq);
	}
#endif

Note that the '_FIXME' postfix effectively turns off this particular debug
check ...

Thomas and I realized this risk a week ago, and tried to resurrect full
CONFIG_DEBUG_SHIRQ=y functionality to more reliably trigger these problems:

  https://lkml.org/lkml/2017/2/15/341

... but were forced to revert that fix because it's not working on x86 yet
(it's crashing).

We also thought we had fixed the problems exposed in drivers, as the retrigger
changes have been in -tip and -next for some time, but were clearly too
optimistic about that.

So, should we revert the hw-retrigger change:

  a9b4f08770b4 x86/ioapic: Restore IO-APIC irq_chip retrigger callback

... until we manage to fix CONFIG_DEBUG_SHIRQ=y?

If you'd like to revert it upstream straight away:

  Acked-by: Ingo Molnar <mingo@xxxxxxxxxx>

Thanks,

	Ingo
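
P.S. To make concrete what the shared-IRQ debug check is meant to flush out:
a handler on a shared line may be called when its own device raised nothing,
and it has to cope with that. Below is a minimal userspace sketch in C, not
kernel code. All names in it (toy_dev, toy_irq_handler, fake_shared_irq,
toy_selfcheck) are hypothetical, and irqreturn_t is a local stand-in rather
than the kernel's definition. It only models the "call the handler once
without a real interrupt" idea and leaves out the disable_irq() and
local_irq_save() steps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel types (assumption, not kernel headers). */
typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1 } irqreturn_t;
typedef irqreturn_t (*irq_handler_t)(int irq, void *dev_id);

/* Hypothetical device state; a real driver would read a status register. */
struct toy_dev {
	bool     initialized;  /* set once driver setup has completed */
	uint32_t pending;      /* simulated interrupt status bits */
	unsigned handled;      /* number of events actually serviced */
};

/* A handler that tolerates being called when its device raised nothing. */
static irqreturn_t toy_irq_handler(int irq, void *dev_id)
{
	struct toy_dev *dev = dev_id;

	(void)irq;

	/* Not set up yet: touch nothing and report "not ours". */
	if (dev == NULL || !dev->initialized)
		return IRQ_NONE;

	/* Nothing pending: spurious or another device on the shared line. */
	if (dev->pending == 0)
		return IRQ_NONE;

	dev->pending = 0;      /* acknowledge the simulated event */
	dev->handled++;
	return IRQ_HANDLED;
}

/* Models the debug check: invoke the handler once with no real interrupt. */
static irqreturn_t fake_shared_irq(irq_handler_t handler, int irq, void *dev_id)
{
	return handler(irq, dev_id);
}

/* Small self-check: a fake call is ignored, a real event is serviced. */
int toy_selfcheck(void)
{
	struct toy_dev dev = { .initialized = true, .pending = 0, .handled = 0 };

	assert(fake_shared_irq(toy_irq_handler, 9, &dev) == IRQ_NONE);
	assert(dev.handled == 0);

	dev.pending = 0x1;
	assert(toy_irq_handler(9, &dev) == IRQ_HANDLED);
	assert(dev.handled == 1);
	return 0;
}
```

A handler that skipped the two early checks and dereferenced half-initialized
state would be the sort of driver that falls over on a fake or retriggered
interrupt.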