On 15.12.20 23:12, Thomas Gleixner wrote: > On Tue, Dec 15 2020 at 21:12, Enrico Weigelt wrote: >> On 09.12.20 00:01, Thomas Gleixner wrote: >>> 3) It's invoked from __handle_domain_irq() when the 'hwirq' which is >>> handed in by the caller does not resolve to a mapped Linux >>> interrupt which is pretty much the same as the x86 situation above >>> in #1, but it prints useless data. >>> >>> It prints 'irq' which is invalid but it does not print the really >>> interesting 'hwirq' which was handed in by the caller and did >>> not resolve. >> >> I wouldn't say the irq-nr isn't interesting. In my particular case it >> was quite what I've been looking for. But you're right, hwirq should >> also be printed. > > The number is _not_ interesting in this case. It's useless because the > function does: Oh, I've mixed up the cases - I only had the other one, down below. > irq = hwirq; > > if (lookup) > irq = find_mapping(hwirq); > > if (!irq || irq >= nr_irqs) > -> BAD When exactly can that happen ? Only when some hardware sending an IRQ, but no driver listening to it, or are there other cases ? By the way: who's supposed to call that function ? Only irqchip's (and the few soc specific 1st-level irq handlers) ? I'm asking, because we have lots of gpio drivers, which have their own irq domain, but go the generic_handle_irq() route. Same for some SOC-specific irqchips. Should they also call handle_domain_irq() instead ? > In both cases the only interesting information is that hwirq does not > resolve to a valid Linux interrupt number and which hwirq number caused > that. Don't we also need know which irqchip the hwirq number belongs to ? > If you look really then you find out that there is exactly _ONE_ > architecture which does anything else than incrementing a counter and/or > printing stuff: X86, which has a big fat comment explaining why. The > only way to ack an interrupt on X86 is to issue EOI on the local APIC, > i.e. it does _not_ need any further information. Yeah, found it :) At this point I wonder whether the ack_APIC_irq() call could be done somewhere further up in the call chain, eg. handle_irq() or common_interrupt() ? If that works, we IMHO could drop ack_bad_irq() completely (except for the counter and printk, which we could consolidate elsewhere anyways) >> ... rethinking this further ... shouldn't we also pass in even more data >> (eg. irq_desc, irqchip, ...), so this function can check which hw to >> actually talk to ? > > There are 3 ways to get there: > > 1) via dummy chip which obviously has no hardware associated ... which also calls print_irq_desc() .. > 2) via handle_bad_irq() which prints the info already print_irq_desc() doesn't seem to print the hwirq ... shall we fix this ? > 3) __handle_domain_irq() which cannot print anything and obviously > cannot figure out the hw to talk to because there is no irq > descriptor associated. Okay, what's the conclusion ? Drop printouts in the ack_bad_irq()'s ? >>> 4) It's invoked from the dummy irq chip which is installed for a >>> couple of truly virtual interrupts where the invocation of >>> dummy_irq_chip::irq_ack() is indicating wreckage. >>> >>> In that case the Linux irq number is the thing which is printed. >>> >>> So no. It's not just inconsistent it's in some places outright >>> wrong. What we really want is: >>> >>> ack_bad_irq(int hwirq, int virq) >> >> is 'int' correct here ? > > This was just for illustration. Okay, thanks. Just discovered already have an irq_hw_number_t, which doesn't seem to be used everywhere ... shall we fix that ? >> OTOH: since both callers (dummychip.c, handle.c) already dump out before >> ack_bad_irq(), do we need to print out anything at all ? > > Not all callers print something, but yes this could do with some general > cleanup. I've found three callers, only one (__handle_domain_irq() in irqdesc.c) doesn't print out anything. I belive, adding a pr_warn() here and drop all the printouts in ack_bad_irq()'s makes sense. > The error counter is independent of that, but yes there is room for > consolidation. Ok, I've already started hacking a bit here: adding an atomic_t counter in kernel/irq/handle.c and inline'd accessor functions in include/asm-generic/irq.h (just feeling that accessors are a bit cleaner than direct access). Would that be okay ? By the way: I still wonder whether my case should have ever reached ack_bad_irq(). The irqdescs had been allocated via devm_irq_alloc_descs(), and the driver just called generic_handle_irq() with base irq + gpio nr. So, IMHO it was a valid linux irq number, but no (explicit) handler. I wonder whether ack'ing those virtual irqs onto hw could be harmful. --mtx -- --- Hinweis: unverschlüsselte E-Mails können leicht abgehört und manipuliert werden ! Für eine vertrauliche Kommunikation senden Sie bitte ihren GPG/PGP-Schlüssel zu. --- Enrico Weigelt, metux IT consult Free software and Linux embedded engineering info@xxxxxxxxx -- +49-151-27565287