On Fri, Jun 25, 2021 at 12:18:48AM +0100, Robin Murphy wrote:
> On 2021-06-24 22:57, Bjorn Helgaas wrote:
> > On Tue, Jun 08, 2021 at 10:04:09AM +0200, Javier Martinez Canillas wrote:
> > > IRQ handlers that are registered for shared interrupts can be called at
> > > any time after have been registered using the request_irq() function.
> > >
> > > It's up to drivers to ensure that's always safe for these to be called.
> > >
> > > Both the "pcie-sys" and "pcie-client" interrupts are shared, but since
> > > their handlers are registered very early in the probe function, an error
> > > later can lead to these handlers being executed before all the required
> > > resources have been properly setup.
> > >
> > > For example, the rockchip_pcie_read() function used by these IRQ handlers
> > > expects that some PCIe clocks will already be enabled, otherwise trying
> > > to access the PCIe registers causes the read to hang and never return.
> >
> > The read *never* completes?  That might be a bit problematic because
> > it implies that we may not be able to recover from PCIe errors.  Most
> > controllers will timeout eventually, log an error, and either
> > fabricate some data (typically ~0) to complete the CPU's read or cause
> > some kind of abort or machine check.
> >
> > Just asking in case there's some controller configuration that should
> > be tweaked.
>
> If I'm following correctly, that'll be a read transaction to the native side
> of the controller itself; it can't complete that read, or do anything else
> either, because it's clock-gated, and thus completely oblivious (it might be
> that if another CPU was able to enable the clocks then everything would
> carry on as normal, or it might end up totally deadlocking the SoC
> interconnect). I think it's safe to assume that in that state nothing of
> importance would be happening on the PCIe side, and even if it was we'd
> never get to know about it.

Oh, right, that makes sense.  I was thinking about the PCIe side, but
if the controller itself isn't working, of course we wouldn't get that
far.

I would expect that the CPU itself would have some kind of timeout for
the read, but that's far outside of the PCI world.

Bjorn