On Wed, Oct 7, 2020 at 4:20 PM Marc Zyngier <maz@xxxxxxxxxx> wrote: > On 2020-10-07 14:10, Andy Shevchenko wrote: > > On Wed, Oct 7, 2020 at 3:09 PM Marc Zyngier <maz@xxxxxxxxxx> wrote: > >> On 2020-10-07 13:02, Andy Shevchenko wrote: > >> > On Wed, Oct 7, 2020 at 12:49 PM Linus Walleij > >> > <linus.walleij@xxxxxxxxxx> wrote: > >> >> On Mon, Oct 5, 2020 at 4:02 PM Marc Zyngier <maz@xxxxxxxxxx> wrote: > >> >> > >> >> > The pca953x driver never checks the result of irq_find_mapping(), > >> >> > which returns 0 when no mapping is found. When a spurious interrupt > >> >> > is delivered (which can happen under obscure circumstances), the > >> >> > kernel explodes as it still tries to handle the error code as > >> >> > a real interrupt. > >> >> > > >> >> > Handle this particular case and warn on spurious interrupts. > >> >> > > >> >> > Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx> > >> > > >> > Wait, doesn't actually [1] fix the reported issue? > >> > >> Not at all. > >> > >> > Marc, can you confirm this? > >> > > >> > [1]: e43c26e12dd4 ("gpio: pca953x: Fix uninitialized pending variable") > >> > >> Different bug, really. If an interrupt is *really* pending, and no > >> mapping established yet, feeding the result of irq_find_mapping() to > >> handle_nested_irq() will lead to a panic. > > > > I don't understand. We have plenty of drivers doing exactly the way > > without checking this returned code. > > I'm sure we do. Most driver code is buggy as hell, but I don't see that > as a reason to cargo-cult the crap. The API is crystal clear that it can > return 0 for no mapping, and 0 isn't a valid interrupt. Yes, and the problem here is that we got this response from IRQ core, which we shouldn't. > > What circumstances makes the mapping be absent? > > Other bugs in the system ([1]), spurious interrupts (which can *always* > happen). > > > Shouldn't we rather change this: > > > > girq->handler = handle_simple_irq; > > to this: > > girq->handler = handle_bad_irq; > > ? > > I don't understand what you are trying to achieve with that, apart from > maybe breaking the driver. The right way to handle spurious interrupts > is by telling the core code that the interrupt wasn't handled, and to > let > the spurious interrupt code do its magic. handle_bad_irq() is exactly for handling spurious IRQs as far as we believe documentation. So, by default the driver assigns (should assign) handle_bad_irq() to all IRQs as a default handler. If, by any chance, we got it, we already have a proper handler in place. The read handler is assigned whenever the IRQ core is called to register it (by means of ->irq_set_type() callback). My understanding that GPIO IRQ drivers are designed (should be designed) in this way. The approach will make us sure that we don't have spurious interrupts with assigned handlers. > [1] https://lore.kernel.org/r/20201005111443.1390096-1-maz@xxxxxxxxxx -- With Best Regards, Andy Shevchenko