On Thu, 2024-10-31 at 11:40 +0100, Uwe Kleine-König wrote: > Hello, > > On Wed, Oct 30, 2024 at 08:44:29PM +0000, Jonathan Cameron wrote: > > On Wed, 30 Oct 2024 14:04:58 +0100 > > Nuno Sá <noname.nuno@xxxxxxxxx> wrote: > > > > > On Mon, 2024-10-28 at 17:07 +0100, Uwe Kleine-König wrote: > > > > Some of the ADCs by Analog signal their irq condition on the MISO line. > > > > So typically that line is connected to an SPI controller and a GPIO. The > > > > GPIO is used as input and the respective interrupt is enabled when the > > > > last SPI transfer is completed. > > > > > > > > Depending on the GPIO controller the toggling MISO line might make the > > > > interrupt pending even while it's masked. In that case the irq handler > > > > is called immediately after irq_enable() and so before the device > > > > actually pulls that line low which results in non-sense values being > > > > reported to the upper layers. > > > > > > > > The only way to find out if the line was actually pulled low is to read > > > > the GPIO. (There is a flag in AD7124's status register that also signals > > > > if an interrupt was asserted, but reading that register toggles the MISO > > > > line and so might trigger another spurious interrupt.) > > > > > > > > Add the possibility to specify an interrupt GPIO in the machine > > > > description instead of a plain interrupt. This GPIO is used as interrupt > > > > source and to check if the irq line is actually active in the irq > > > > handler. > > > > > > > > Signed-off-by: Uwe Kleine-König <u.kleine-koenig@xxxxxxxxxxxx> > > > > --- > > > > > > Hi all, > > > > > > Regarding this, I do share some of the concerns already raised by Jonathan. I > > > fear > > > that we're papering around an issue with the IRQ controller rather than being > > > an > > > issue with the device. When I look at irq_disable() docs [1], it feels that > > > we're > > > already doing what we're supposed to do. IOW, we disable the lazy approach so > > > we > > > *should* not get any pending IRQ. > > I think this is wrong and you always have to be prepared to see an irq > triggering that became pending while masked. > > > > Also looking at drivers as the xilinx gpio controller, it seems some > > > are careful about this [2] and make sure to clear all pending IRQs > > > when unmasking. > > Your links are both to the same place. > > The right one is: > https://elixir.bootlin.com/linux/v6.11.5/source/drivers/gpio/gpio-xilinx.c#L419 > > I think this is buggy, see below for the reasoning. > > > > Jonathan also said this: > > > > > > "True enough - that race is a possibility, but not all interrupt inputs > > > are capable of gpio usage whilst setup to received interrupts." > > Race should be easy to avoid using a level interrupt now I think more on that: > > can't miss a level. > > In general this isn't true. If it were that easy we could just assume > all irqs being level interrupts and simplify the irq code a bit. At > least for the ad7124 if a conversion is done, the chip holds the line > low until the next conversion is done. In that case it deasserts > DOUT/̅R̅D̅Y for a short while to create another falling edge signalling > another event. I can imagine this to confuse level detection?! > > > > To my understanding this also means this is doomed to fail for some devices or > > > am I > > > not following it? > > > > If you were wired to one of those, you couldn't use the GPIO trick, but then > > don't have a GPIO in your DT in that case. > > Yes. If the device isn't properly connected in hardware you're out of > luck. But that is also true if the spi clock line isn't connected. So > apart from the requirement that "properly" involves things that are > unusual for other SPI devices, that's expected. Having said that it was > clear before because the MISO (aka DOUT/̅R̅D̅Y) line was already know to have > to be connected to an irq capable pin. > > > > All that said, my naive feeling would be for a masked line to not get any > > > pending IRQ > > > and if it does, the driver should make sure to clean all outstanding interrupts > > > when > > > unmasking. But I'm far from being an expert of the IRQ subsystem. Maybe it > > > would be > > > interesting to get some inputs about someone who actually knows better? > > +CC Thomas Gleixner, > > > > Annoying case where a wire is both the interrupt source for dataready and the > > SPI data line (if separate clock signal is toggling) So currently the driver > > masks interrupts at the host end, but we have at least one interrupt controller > > where they end up pending and fire on reenabling the interrupt. Querying the > > device to check the status register then ends up causing it to happen again, > > so that doesn't help. > > > > Proposal is to query it as a GPIO (or maybe a separate GPIO wired to the same > > pin) to check the level when an interrupt comes in. > > In my understanding it's the expected behaviour of an irq controller > that a masked irq becomes pending if the irq event (level or edge) > happens and then triggers immediately after enable_irq() -- independent > of laziness being used or not. > I'm really not sure about that. If a consumer disables/masks an interrupt, then I would think it expects no interrupts. If one comes during that time, it seems reasonable to me that the IRQ is discarded. And if the expected behavior is to have a pending IRQ if we got it while masked, then I'm not sure why we have the UNLAZY thing. I mean, let's always do the lazy approach which is effectively only masking the line if we get an IRQ while disabled (and also mark it as pending). If both approaches result in a pending IRQ... But again, not sure what are the expectations here and far from an expert on the IRQ subsystem. > That's also the only way to prevent missing interrupts. To understand > that consider an irq that signals there is data in a fifo. The handler > reads that data out and when it's empty returns. Returning results in > unmasking the interrupt again. If new data arrives between seeing the > fifo empty and reenabling the irq you miss the event if the irq doesn't > become pending while it's masked. > > I could argue that's the old issue of not handling the IRQs fast enough in the handler in which case some other alternatives would have to be implemented... Anyways, I guess it can go both ways. Because it can also be plausible that if one IRQ came while the line is masked then the consumer does not really care about it and if we get it right after enable_irq(), we get a spurious interrupt (which is the case here). But well, in the end you might be right in the sense that putting this awareness into the irqchip is asking for other issues. And for devices that cannot disable the IRQ at the device level and rely on disable_irq(), we need to handle it at the device level. - Nuno Sá >