Re: dwc2: irq 66: nobody cared triggered on resume

Hi,

On 23.06.24 at 15:27, Stefan Wahren wrote:
Hello Lukas,

On 22.06.24 at 20:47, Lukas Wunner wrote:
On Sat, Jun 22, 2024 at 02:23:33PM +0200, Stefan Wahren wrote:
I am currently experimenting with suspend-to-idle on the Raspberry Pi 3 A+.
Suspend and resume work as expected as long as no USB device is connected
to the board. If I connect a USB hub to the Pi, the resume phase is
significantly delayed and the kernel disables IRQ 66, which belongs
to DWC2.
[...]
[ 1131.109996] PM: noirq resume of devices complete after 1.273 msecs
[ 1131.111208] PM: early resume of devices complete after 1.051 msecs
[ 1131.230277] brcmfmac: brcmf_fw_alloc_request: using brcm/brcmfmac43455-sdio for chip BCM4345/6
[ 1131.458687] irq 66: nobody cared (try booting with the "irqpoll" option)
[ 1131.458714] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 6.10.0-rc3-g7fd4227d1bd5-dirty #49
[ 1131.458734] Hardware name: BCM2835
[ 1131.458744] Call trace:
[...]
[ 1131.458877] note_interrupt from handle_irq_event+0x88/0x8c
[ 1131.458900] handle_irq_event from handle_level_irq+0xb4/0x1ac
[ 1131.458923] handle_level_irq from generic_handle_domain_irq+0x24/0x34
[ 1131.458957] generic_handle_domain_irq from bcm2836_chained_handle_irq+0x24/0x28
[ 1131.458992] bcm2836_chained_handle_irq from generic_handle_domain_irq+0x24/0x34
[ 1131.459024] generic_handle_domain_irq from generic_handle_arch_irq+0x34/0x44
[ 1131.459056] generic_handle_arch_irq from __irq_svc+0x88/0xb0
[ 1131.459079] Exception stack(0xc1b01f20 to 0xc1b01f68)
[ 1131.459142] __irq_svc from default_idle_call+0x1c/0xb0
[ 1131.459167] default_idle_call from do_idle+0x21c/0x284
[ 1131.459202] do_idle from cpu_startup_entry+0x28/0x2c
[ 1131.459239] cpu_startup_entry from kernel_init+0x0/0x12c
[ 1131.459271] handlers:
[ 1131.459279] [<f539e0f4>] dwc2_handle_common_intr
[ 1131.459308] [<75cd278b>] usb_hcd_irq
[ 1131.459329] Disabling IRQ #66
[...]
Any ideas what is causing this issue?
Interrupts are re-enabled after the resume_noirq phase.  Looks like
the chip signals an interrupt right afterwards but the two hardirq
handlers do not feel responsible.

The only option might be to add a few printk() calls in
dwc2_handle_common_intr(), usb_hcd_irq() and dwc2_handle_hcd_intr()
(called from usb_hcd_irq())
to see why they're all returning IRQ_NONE without clearing the source
of the interrupt.  The chip just keeps signaling interrupts because
the driver doesn't handle them, hence the IRQ storm which the IRQ core
eventually stops by outright disabling the interrupt.
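
The mechanism described above can be illustrated with a small userspace
model. This is only a sketch: the window of 100000 interrupts and the
limit of 99900 unhandled ones are recalled from kernel/irq/spurious.c and
should be checked against the tree in use, the time-based reset the kernel
applies is left out, and IrqLine and fires_until_disabled are hypothetical
names, not kernel APIs.

```python
# Simplified userspace model of spurious-interrupt accounting.
# WINDOW and UNHANDLED_LIMIT are assumptions recalled from
# kernel/irq/spurious.c; check them against your kernel tree.
IRQ_NONE, IRQ_HANDLED = 0, 1
WINDOW = 100_000
UNHANDLED_LIMIT = 99_900


class IrqLine:
    """Hypothetical model of one shared interrupt line."""

    def __init__(self, handlers):
        self.handlers = handlers
        self.count = 0
        self.unhandled = 0
        self.disabled = False

    def fire(self):
        """Deliver one interrupt to every handler sharing the line."""
        if self.disabled:
            return
        results = [handler() for handler in self.handlers]
        if IRQ_HANDLED not in results:
            self.unhandled += 1
        self.count += 1
        if self.count < WINDOW:
            return
        # End of window: disable the line if almost nothing was handled.
        if self.unhandled > UNHANDLED_LIMIT:
            self.disabled = True
        self.count = 0
        self.unhandled = 0


def fires_until_disabled(line, max_fires):
    """Return the number of interrupts delivered before the line was
    disabled, or None if it survived max_fires interrupts."""
    for n in range(1, max_fires + 1):
        line.fire()
        if line.disabled:
            return n
    return None
```

With two handlers that always return IRQ_NONE, the modelled line is
disabled at the end of the first window. If either handler claims the
interrupt, the line stays enabled.
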
Thanks for your suggestion. Unfortunately, placing printk() calls in those
busy interrupt handlers is futile, so I switched to debugfs. This issue
would be much easier to debug if the interrupt weren't shared. But
first let me share some outputs before I start to extend debugfs further:

1. No hub connected to Rpi 3 A+

root@raspberrypi:/sys/kernel/debug/usb/3f980000.usb# cat state
DCFG=0x00000000, DCTL=0x00000000, DSTS=0x0007ff02
DIEPMSK=0x00000000, DOEPMASK=0x00000000
GINTMSK=0xf3000806, GINTSTS=0x04000023
DAINTMSK=0x00000000, DAINT=0x00000000
GNPTXSTS=0x00080100, GRXSTSR=3f83bbfe

2. Hub connected before suspend / irq issue

DCFG=0x00000000, DCTL=0x00000000, DSTS=0x0007a202
DIEPMSK=0x00000000, DOEPMASK=0x00000000
GINTMSK=0xf300080e, GINTSTS=0x04000023
DAINTMSK=0x00000000, DAINT=0x00000000
GNPTXSTS=0x08080100, GRXSTSR=789a460a

3. Hub connected after suspend / irq issue

DCFG=0x00000000, DCTL=0x00000000, DSTS=0x0007ff02
DIEPMSK=0x00000000, DOEPMASK=0x00000000
GINTMSK=0xf1000806, GINTSTS=0x0500002b
DAINTMSK=0x000000ff, DAINT=0x00000000
GNPTXSTS=0x29080100, GRXSTSR=befdf595

Based on my limited knowledge and observations, the issue seems to be
related to GINTMSK/GINTSTS and an outstanding GINTSTS_PRTINT.
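
The three register dumps above can be compared bit by bit with a short
script. It is a sketch under two assumptions: GINTSTS_PRTINT is bit 24 and
the current-mode (host) flag is bit 0, as recalled from
drivers/usb/dwc2/hw.h. The helper names are hypothetical.

```python
# Bit positions are assumptions recalled from drivers/usb/dwc2/hw.h.
CURMODE_HOST = 1 << 0
PRTINT = 1 << 24

# (GINTMSK, GINTSTS) pairs copied from the three debugfs dumps above.
SNAPSHOTS = {
    "no hub": (0xF3000806, 0x04000023),
    "hub, before suspend": (0xF300080E, 0x04000023),
    "hub, after suspend": (0xF1000806, 0x0500002B),
}


def prtint_pending(gintmsk, gintsts):
    """True if the port interrupt is both raised and unmasked."""
    return bool(gintsts & gintmsk & PRTINT)


def changed_bits(before, after):
    """Bit numbers that differ between two 32-bit register values."""
    diff = before ^ after
    return [bit for bit in range(32) if (diff >> bit) & 1]


if __name__ == "__main__":
    for name, (mask, status) in SNAPSHOTS.items():
        host = bool(status & CURMODE_HOST)
        print(f"{name}: host_mode={host} "
              f"prtint_pending={prtint_pending(mask, status)}")
```

Under these assumptions only the third dump has the port interrupt raised
and unmasked, and all three dumps show the controller in host mode.
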
I narrowed this down a little further. At least I now know the reason for
the "nobody cared" message. It's clear that the issue is triggered by
GINTSTS_PRTINT. The DWC2 controller is in host mode, so
dwc2_handle_common_intr() ignores the interrupt and returns IRQ_NONE.
But usb_hcd_irq() cannot handle it either, because HCD_FLAG_HW_ACCESSIBLE
is still clear, so that handler also returns IRQ_NONE :-(
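
The reasoning above reduces to a small decision table, sketched here in
Python. It models only the case described in this message (a pending port
interrupt while the controller is in host mode). The function names are
hypothetical stand-ins for the two handlers, not the driver code.

```python
IRQ_NONE, IRQ_HANDLED = 0, 1


def common_intr_model():
    """Stand-in for dwc2_handle_common_intr(): per the analysis above,
    a port interrupt in host mode is not claimed here."""
    return IRQ_NONE


def hcd_irq_model(hw_accessible):
    """Stand-in for usb_hcd_irq(): the interrupt is only processed
    while HCD_FLAG_HW_ACCESSIBLE is set."""
    return IRQ_HANDLED if hw_accessible else IRQ_NONE


def shared_line_result(hw_accessible):
    """Combined result of both handlers on the shared line."""
    results = [common_intr_model(), hcd_irq_model(hw_accessible)]
    return IRQ_HANDLED if IRQ_HANDLED in results else IRQ_NONE
```

With hw_accessible=False, both stand-ins return IRQ_NONE, which is the
condition under which the IRQ core eventually disables the line.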

Is disabling the IRQ via the upper layers an expected behavior instead
of letting the DWC2 controller driver resolve the situation?

But back to the root cause. I followed the suspend/resume path to find out
why HCD_FLAG_HW_ACCESSIBLE is not set again.

Suspend path:

The power-down mode is DWC2_POWER_DOWN_PARAM_NONE, so
HCD_FLAG_HW_ACCESSIBLE is cleared (
https://elixir.bootlin.com/linux/v6.10-rc3/source/drivers/usb/dwc2/hcd.c#L4385
).

Resume path:

During resume the HPRT0_CONNSTS flag is set, so
HCD_FLAG_HW_ACCESSIBLE is not set again (
https://elixir.bootlin.com/linux/v6.10-rc3/source/drivers/usb/dwc2/hcd.c#L4435
).
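
The two paths can be summarised as a tiny state model. It is a sketch of
what this message describes, not of the driver: the real suspend and
resume functions have more branches, and the class and attribute names
are hypothetical.

```python
class HcdFlagModel:
    """Tracks only HCD_FLAG_HW_ACCESSIBLE across suspend and resume."""

    def __init__(self):
        self.hw_accessible = True

    def suspend(self):
        # DWC2_POWER_DOWN_PARAM_NONE path as described above:
        # the flag is cleared on suspend.
        self.hw_accessible = False

    def resume(self, port_connected):
        # HPRT0_CONNSTS set: resume returns early and leaves the flag clear.
        if port_connected:
            return
        self.hw_accessible = True


def flag_after_cycle(port_connected):
    """HCD_FLAG_HW_ACCESSIBLE state after one suspend/resume cycle."""
    hcd = HcdFlagModel()
    hcd.suspend()
    hcd.resume(port_connected)
    return hcd.hw_accessible
```

In this model a connected hub leaves the flag clear after resume, which
matches the observation that the problem only appears with a USB device
attached.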

Is the reason for this behavior the lack of clock gating support on
BCM283x or is it a driver bug?

How can I figure out whether clock gating is supported?

Regards

