On Wed, Aug 14, 2024 at 05:28:37AM +0900, Krzysztof Wilczyński wrote:
> > Currently, the endpoint cleanup function dw_pcie_ep_cleanup() and EPF
> > deinit notify function pci_epc_deinit_notify() are called during the
> > execution of qcom_pcie_perst_assert() i.e., when the host has asserted
> > PERST#. But quickly after this step, refclk will also be disabled by the
> > host.
> >
> > All of the Qcom endpoint SoCs supported as of now depend on the refclk from
> > the host for keeping the controller operational. Due to this limitation,
> > any access to the hardware registers in the absence of refclk will result
> > in a whole endpoint crash. Unfortunately, most of the controller cleanups
> > require accessing the hardware registers (like eDMA cleanup performed in
> > dw_pcie_ep_cleanup(), powering down MHI EPF etc...). So these cleanup
> > functions are currently causing the crash in the endpoint SoC once host
> > asserts PERST#.
> >
> > One way to address this issue is by generating the refclk in the endpoint
> > itself and not depending on the host. But that is not always possible as
> > some of the endpoint designs do require the endpoint to consume refclk from
> > the host (as I was told by the Qcom engineers).
> >
> > So let's fix this crash by moving the controller cleanups to the start of
> > the qcom_pcie_perst_deassert() function. qcom_pcie_perst_deassert() is
> > called whenever the host has deasserted PERST# and it is guaranteed that
> > the refclk would be active at this point. So at the start of this function,
> > the controller cleanup can be performed. Once finished, rest of the code
> > execution for PERST# deassert can continue as usual.
>
> Applied to controller/qcom, thank you!
>
> [1/1] PCI: qcom-ep: Move controller cleanups to qcom_pcie_perst_deassert()
>       https://git.kernel.org/pci/pci/c/6960cdc1ef97

I dropped this for now, looking for a new simpler version without
"cleanup_pending" and a similar change for tegra194 (separate patch).

I think it's still an open question whether both
pci_epc_deinit_notify() and pci_epc_init_notify() are needed, but that
should be separate and I don't think that would fix a crash.

You said this was not strictly v6.11 material, but it does fix a
crash, and it only touches the endpoint driver, so ... it seems like a
possible candidate, especially if we can identify a recent commit that
caused the crash.

Bjorn