On Tue, May 10, 2022 at 02:13:20PM +0200, Lukas Wunner wrote: > On Mon, May 09, 2022 at 09:54:21AM +0100, Mark Rutland wrote: > > On Fri, May 06, 2022 at 10:32:42PM +0200, Lukas Wunner wrote: > > > On Tue, Oct 26, 2021 at 10:25:04AM +0100, Mark Rutland wrote: > > > > int generic_handle_domain_irq(struct irq_domain *domain, unsigned int hwirq) > > > > { > > > > + WARN_ON_ONCE(!in_irq()); > > > > return handle_irq_desc(irq_resolve_mapping(domain, hwirq)); > > > > } > > > > EXPORT_SYMBOL_GPL(generic_handle_domain_irq); > > > > > > Why isn't the WARN_ON_ONCE() conditional on handle_enforce_irqctx()? > > > (See handle_irq_desc() and c16816acd086.) > > > > I did this for consistency with the in_nmi() check in > > generic_handle_domain_nmi(); I was unaware of commit c16816acd086 and > > IRQD_HANDLE_ENFORCE_IRQCTX. > > Actually, since you're mentioning the in_nmi() check, I suspect > there's another problem here: > > generic_handle_domain_nmi() warns if !in_nmi(), then calls down > to handle_irq_desc() which warns if !in_hardirq(). Doesn't this > cause a false-positive !in_hardirq() warning for a NMI on GIC/GICv3? I agree that doesn't look right. > The only driver calling request_nmi() or request_percpu_nmi() is > drivers/perf/arm_pmu.c. So that's the only one affected. > You may want to test if that driver indeed exhibits such a > false-positive warning since c16816acd086. In testing with v5.18-rc5, I can't see that going wrong. I also hacked the following in: -------->8-------- diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c index 939d21cd55c38..3c85608a8779f 100644 --- a/kernel/irq/irqdesc.c +++ b/kernel/irq/irqdesc.c @@ -718,6 +718,7 @@ EXPORT_SYMBOL_GPL(generic_handle_domain_irq); int generic_handle_domain_nmi(struct irq_domain *domain, unsigned int hwirq) { WARN_ON_ONCE(!in_nmi()); + WARN_ON_ONCE(!in_hardirq()); return handle_irq_desc(irq_resolve_mapping(domain, hwirq)); } #endif -------->8-------- ... and that never fired, which doesn't seem right. I also see some other problems which may or may not be related (lockdep tracking gone wrong, which could indicate a more general failure to track irq state correctly). I'll go dig into that... > > > I believe the above change causes a regression in drivers/gpio/gpio-dln2.c > > > such that a gratuitous WARN splat is now emitted. > > > > Do you mean you beleive that from inspection, or are you actually seeing this > > go wrong in practice? > > > > If it's the latter, are you able to give a copy of the dmesg splat? > > For gpio-dln2.c, I believe it from inspection. > > For smsc95xx.c, I'm actually seeing it go wrong in practice, > unedited dmesg splat is included below FWIW. Thanks; having the trace makes this much easier to analyse. Mark. > That's with the following series applied on top of current net-next: > https://lore.kernel.org/netdev/cover.1651574194.git.lukas@xxxxxxxxx/ > > You need to remove the __irq_enter_raw() in patch [5/7] of that series > to actually see the WARN splat. I used it to work around your > WARN_ON_ONCE() and that's what Marc took exception to. > > With gpio-dln2.c, the call stack looks like this: > > dln2_rx() # drivers/mfd/dln2.c > dln2_run_event_callbacks() > dln2_gpio_event() # drivers/gpio/gpio-dln2.c > generic_handle_domain_irq() > > That's basically the same as the callstack for smsc95xx.c below. > Interrupts are disabled in dln2_rx() via spin_lock_irqsave(), > but there isn't an __irq_enter_raw() anywhere to be seen > and that would be necessary to avoid the false-positive > WARN splat in generic_handle_domain_irq(). > > Thanks, > > Lukas > > -- >8 -- > > [ 1227.718928] WARNING: CPU: 3 PID: 75 at kernel/irq/irqdesc.c:702 generic_handle_domain_irq+0x88/0x94 > [ 1227.718951] Modules linked in: sha256_generic cfg80211 rfkill 8021q garp stp llc ax88796b asix raspberrypi_hwmon bcm2835_codec(C) bcm2835_v4l2(C) bcm2835_isp(C) v4l2_mem2mem bcm2835_mmal_vchiq(C) snd_bcm2835(C) vc_sm_cma(C) videobuf2_dma_contig videobuf2_vmalloc snd_pcm videobuf2_memops videobuf2_v4l2 videobuf2_common snd_timer snd videodev mc micrel ks8851_spi ks8851_common uio_pdrv_genirq uio eeprom_93cx6 piControl(O) ad5446 ti_dac082s085 mcp320x iio_mux mux_gpio mux_core fixed gpio_74x164 spi_bcm2835aux spi_bcm2835 gpio_max3191x crc8 industrialio i2c_dev ip_tables x_tables ipv6 > [ 1227.719076] CPU: 3 PID: 75 Comm: irq/89-dwc_otg_ Tainted: G C O 5.17.0-rt15-v7+ #2 > [ 1227.719084] Hardware name: BCM2835 > [ 1227.719087] Backtrace: > [ 1227.719091] dump_backtrace from show_stack+0x20/0x24 > [ 1227.719106] r7:00000009 r6:00000080 r5:80d25844 r4:60000093 > [ 1227.719109] show_stack from dump_stack_lvl+0x74/0x9c > [ 1227.719120] dump_stack_lvl from dump_stack+0x18/0x1c > [ 1227.719134] r7:00000009 r6:801957f4 r5:000002be r4:80d1e8ac > [ 1227.719137] dump_stack from __warn+0xdc/0x190 > [ 1227.719148] __warn from warn_slowpath_fmt+0x70/0xcc > [ 1227.719161] r8:00000009 r7:801957f4 r6:000002be r5:80d1e8ac r4:00000000 > [ 1227.719163] warn_slowpath_fmt from generic_handle_domain_irq+0x88/0x94 > [ 1227.719177] r8:81e87600 r7:00000000 r6:81d28000 r5:00000000 r4:81d4be00 > [ 1227.719179] generic_handle_domain_irq from smsc95xx_status+0x54/0xb0 > [ 1227.719195] r7:00000000 r6:00008000 r5:81f28640 r4:60000013 > [ 1227.719197] smsc95xx_status from intr_complete+0x80/0x84 > [ 1227.719213] r9:81d40a00 r8:00000001 r7:00000000 r6:00000000 r5:81f28640 r4:81d4bd80 > [ 1227.719216] intr_complete from __usb_hcd_giveback_urb+0xa4/0x12c > [ 1227.719229] r5:815b3000 r4:81d4bd80 > [ 1227.719232] __usb_hcd_giveback_urb from usb_hcd_giveback_urb+0x118/0x11c > [ 1227.719245] r7:811498e0 r6:81d4bd80 r5:81682000 r4:815b3000 > [ 1227.719248] usb_hcd_giveback_urb from completion_tasklet_func+0x7c/0xc8 > [ 1227.719262] r7:811498e0 r6:81d4bd80 r5:81682000 r4:8639f800 > [ 1227.719264] completion_tasklet_func from tasklet_callback+0x20/0x24 > [ 1227.719277] r6:80f8d140 r5:b6b5c330 r4:00000000 > [ 1227.719279] tasklet_callback from tasklet_action_common.constprop.0+0x148/0x220 > [ 1227.719288] tasklet_action_common.constprop.0 from tasklet_hi_action+0x28/0x30 > [ 1227.719301] r10:81d28000 r9:00000001 r8:00000000 r7:811498e0 r6:00000000 r5:00000001 > [ 1227.719304] r4:81003080 > [ 1227.719306] tasklet_hi_action from __do_softirq+0x154/0x3e8 > [ 1227.719316] __do_softirq from __local_bh_enable_ip+0x12c/0x1a8 > [ 1227.719328] r10:801970b0 r9:80f85320 r8:00000001 r7:00000000 r6:00000001 r5:00000100 > [ 1227.719332] r4:40000013 > [ 1227.719334] __local_bh_enable_ip from irq_forced_thread_fn+0x7c/0xac > [ 1227.719347] r9:81d40ac0 r8:814f3c00 r7:00000001 r6:00000001 r5:814f3c00 r4:81d40ac0 > [ 1227.719350] irq_forced_thread_fn from irq_thread+0x16c/0x228 > [ 1227.719363] r7:00000001 r6:81d40ae4 r5:81d28000 r4:00000000 > [ 1227.719365] irq_thread from kthread+0x100/0x140 > [ 1227.719380] r10:00000000 r9:8154fbfc r8:81d43000 r7:81d40ac0 r6:80196dcc r5:81d40b00 > [ 1227.719383] r4:81d28000 > [ 1227.719386] kthread from ret_from_fork+0x14/0x34 > [ 1227.719394] Exception stack(0x81777fb0 to 0x81777ff8) > [ 1227.719401] 7fa0: 00000000 00000000 00000000 00000000 > [ 1227.719408] 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 > [ 1227.719414] 7fe0: 00000000 00000000 00000000 00000000 00000013 00000000 > [ 1227.719422] r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:8014a6c8 r4:81d40b00 > [ 1227.719425] ---[ end trace 0000000000000000 ]---