Hello Nuno, On Mon, Nov 04, 2024 at 02:15:49PM +0100, Nuno Sá wrote: > On Mon, 2024-11-04 at 13:49 +0100, Uwe Kleine-König wrote: > > On Thu, Oct 31, 2024 at 01:05:21PM +0100, Nuno Sá wrote: > > > On Thu, 2024-10-31 at 11:40 +0100, Uwe Kleine-König wrote: > > > > On Wed, Oct 30, 2024 at 08:44:29PM +0000, Jonathan Cameron wrote: > > > > > On Wed, 30 Oct 2024 14:04:58 +0100 > > > > > Nuno Sá <noname.nuno@xxxxxxxxx> wrote: > > > > > > Regarding this, I do share some of the concerns already raised by Jonathan. > > > > > > I fear > > > > > > that we're papering around an issue with the IRQ controller rather than > > > > > > being an > > > > > > issue with the device. When I look at irq_disable() docs [1], it feels that > > > > > > we're > > > > > > already doing what we're supposed to do. IOW, we disable the lazy approach > > > > > > so we > > > > > > *should* not get any pending IRQ. > > > > > > > > I think this is wrong and you always have to be prepared to see an irq > > > > triggering that became pending while masked. > > > > I did some research, here are my findings: > > > > https://www.kernel.org/doc/html/v6.12-rc6/core-api/genericirq.html#delayed-interrupt-disable > > reads: > > > > The interrupt is kept enabled and is masked in the flow handler > > when an interrupt event happens. This prevents losing edge > > interrupts on hardware which does not store an edge interrupt > > event while the interrupt is disabled at the hardware level. > > > > This suggests that lazy disabling is needed for some controllers that > > stop their event detection when disabled. I read that as: *Normally* an > > irq event gets pending in hardware while the irq is disabled. > > I might be wrong, but I think that the lazy approach is the one for optimization. It's both. Needed for some controllers *and* an optimisation (that isn't beneficial in some corner cases). > I think the reasoning is that __normally__ no more IRQs will come so > no need to access the HW. But also thinking more on the subject and > looking at what the lazy approach is doing, I take back what I said in > previous emails. I *think* the expectation for a received IRQ while > the line is masked (or disabled?!), is to keep it as pending (both on > HW - when possible - and in SW). > > > The lazy disable approach is expected to work fine always, the reason to > > implement non-lazy disabling is "only" a performance optimisation. See > > commit e9849777d0e27cdd2902805be51da73e7c79578c. > > Not sure If I understood you correctly, but I think is the other way around? > Also, as said in the commit, I think it also prevents the same interrupt from > happening twice (in some cases). The conversation thread isn't complete on lore.kernel.org, so I don't know for sure, but the way I understand it is: Normally while you handle an irq no new irq comes in and so it's sensible to do lazy disabling. Approximately: In 99.9 % of the cases you save 1 µs by not masking and in the remaining 0.1% you get another hard irq that costs you say 500 µs. So on average you save: 0.999 * 1 µs + 0.001 * (1 - 500) µs = 0.5 µs. However if for a certain device it's normal that another irq comes in the "improvement" degrades to: In 20 % of the cases you save 1 µs and in the remaining 80 % you get a penalty of 500 µs. So in this case it's not an expected win anymore and you can better stop doing lazy disabling and invest the time to mask the irq improving the average cost from - 0.2 * 1 µs + 0.8 * 499 µs = 399.4 µs to 1 µs. The interrupt happening twice is not a problem for correctness as the second one is not given to the device driver but caught in the irq subsystem that only then disables the irq in hardware and marking it pending for later consumption. It "only" costs cpu time. (And maybe it's given to the driver twice after enable_irq is called?) Looking at the problem at hand with the shared DOUT/̅R̅D̅Y line: It's (nearly?) 100% of the cases that the line toggles while the irq is logically disabled/masked. So it makes sense to do irq_set_status_flags(sigma_delta->irq_line, IRQ_DISABLE_UNLAZY); which however should only have an effect iff the irq controller doesn't miss an edge while the irq is disabled. So assuming my understanding is correct, there is something to do about the raspberry pi gpio controller to prevent setting IRQ_DISABLE_UNLAZY have an effect, because that one looses events. > > However that makes me wonder what is the difference between the > > irq_mask() and irq_disable() callbacks defined in struct irq_chip. > > Wondering the same... > > Thanks for digging into this. This has been a long standing thing with sigma delta > ADCs (I'm fairly sure this discussion about being an issue on the IRQ controller or > not already happened before). I keep that paragraph in my reply because the question is still open even though I don't add new infos here. Best regards Uwe
Attachment:
signature.asc
Description: PGP signature