[+cc Vidya, Jon since tegra194 does similar things]

On Mon, Jul 29, 2024 at 05:52:45PM +0530, Manivannan Sadhasivam wrote:
> Currently, the endpoint cleanup function dw_pcie_ep_cleanup() and EPF
> deinit notify function pci_epc_deinit_notify() are called during the
> execution of qcom_pcie_perst_assert() i.e., when the host has asserted
> PERST#. But quickly after this step, refclk will also be disabled by the
> host.
>
> All of the Qcom endpoint SoCs supported as of now depend on the refclk from
> the host for keeping the controller operational. Due to this limitation,
> any access to the hardware registers in the absence of refclk will result
> in a whole endpoint crash. Unfortunately, most of the controller cleanups
> require accessing the hardware registers (like eDMA cleanup performed in
> dw_pcie_ep_cleanup(), powering down MHI EPF etc...). So these cleanup
> functions are currently causing the crash in the endpoint SoC once host
> asserts PERST#.
>
> One way to address this issue is by generating the refclk in the endpoint
> itself and not depending on the host. But that is not always possible as
> some of the endpoint designs do require the endpoint to consume refclk from
> the host (as I was told by the Qcom engineers).
>
> So let's fix this crash by moving the controller cleanups to the start of
> the qcom_pcie_perst_deassert() function. qcom_pcie_perst_deassert() is
> called whenever the host has deasserted PERST# and it is guaranteed that
> the refclk would be active at this point. So at the start of this function,
> the controller cleanup can be performed. Once finished, rest of the code
> execution for PERST# deassert can continue as usual.

What makes this v6.11 material?  Does it fix a problem we added in
v6.11-rc1?

Is there a Fixes: commit?

This patch essentially does this:

  qcom_pcie_perst_assert
  - pci_epc_deinit_notify
  - dw_pcie_ep_cleanup
    qcom_pcie_disable_resources

  qcom_pcie_perst_deassert
  + if (pcie_ep->cleanup_pending)
  +   pci_epc_deinit_notify(pci->ep.epc);
  +   dw_pcie_ep_cleanup(&pci->ep);
    dw_pcie_ep_init_registers
    pci_epc_init_notify

Maybe it makes sense to call both pci_epc_deinit_notify() and
pci_epc_init_notify() from the PERST# deassert function, but it makes
me question whether we really need both.

pcie-tegra194.c has a similar structure:

  pex_ep_event_pex_rst_assert
    pci_epc_deinit_notify
    dw_pcie_ep_cleanup

  pex_ep_event_pex_rst_deassert
    dw_pcie_ep_init_registers
    pci_epc_init_notify

Is there a reason to make them different, or could/should a similar
change be made to tegra?

> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx>
> ---
>  drivers/pci/controller/dwc/pcie-qcom-ep.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/controller/dwc/pcie-qcom-ep.c b/drivers/pci/controller/dwc/pcie-qcom-ep.c
> index 2319ff2ae9f6..e024b4dcd76d 100644
> --- a/drivers/pci/controller/dwc/pcie-qcom-ep.c
> +++ b/drivers/pci/controller/dwc/pcie-qcom-ep.c
> @@ -186,6 +186,8 @@ struct qcom_pcie_ep_cfg {
>   * @link_status: PCIe Link status
>   * @global_irq: Qualcomm PCIe specific Global IRQ
>   * @perst_irq: PERST# IRQ
> + * @cleanup_pending: Cleanup is pending for the controller (because refclk is
> + *                   needed for cleanup)
>   */
>  struct qcom_pcie_ep {
>  	struct dw_pcie pci;
> @@ -214,6 +216,7 @@ struct qcom_pcie_ep {
>  	enum qcom_pcie_ep_link_status link_status;
>  	int global_irq;
>  	int perst_irq;
> +	bool cleanup_pending;
>  };
>
>  static int qcom_pcie_ep_core_reset(struct qcom_pcie_ep *pcie_ep)
> @@ -389,6 +392,12 @@ static int qcom_pcie_perst_deassert(struct dw_pcie *pci)
>  		return ret;
>  	}
>
> +	if (pcie_ep->cleanup_pending) {

Do we really need this flag?  I assume the cleanup functions could
tell whether any previous setup was done?

> +		pci_epc_deinit_notify(pci->ep.epc);
> +		dw_pcie_ep_cleanup(&pci->ep);
> +		pcie_ep->cleanup_pending = false;
> +	}
> +
>  	/* Assert WAKE# to RC to indicate device is ready */
>  	gpiod_set_value_cansleep(pcie_ep->wake, 1);
>  	usleep_range(WAKE_DELAY_US, WAKE_DELAY_US + 500);
> @@ -522,10 +531,9 @@ static void qcom_pcie_perst_assert(struct dw_pcie *pci)
>  {
>  	struct qcom_pcie_ep *pcie_ep = to_pcie_ep(pci);
>
> -	pci_epc_deinit_notify(pci->ep.epc);
> -	dw_pcie_ep_cleanup(&pci->ep);
>  	qcom_pcie_disable_resources(pcie_ep);
>  	pcie_ep->link_status = QCOM_PCIE_EP_LINK_DISABLED;
> +	pcie_ep->cleanup_pending = true;
>  }
>
>  /* Common DWC controller ops */
> --
> 2.25.1
>
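
For anyone following the ordering question, the deferred-cleanup flow in the
quoted patch can be modeled in standalone C.  This is only a sketch under
stated assumptions, not driver code: struct mock_ep and every mock_* function
are hypothetical names invented for illustration, and the "crash" on a
register access without refclk is modeled with assert().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for endpoint state; NOT the real struct qcom_pcie_ep. */
struct mock_ep {
	bool refclk_on;       /* host-supplied reference clock is running */
	bool initialized;     /* controller registers have been programmed */
	bool cleanup_pending; /* cleanup deferred until refclk is back */
	int cleanups_done;    /* number of register-touching cleanups performed */
};

/* Models any register access: only safe while refclk is running. */
void mock_reg_access(const struct mock_ep *ep)
{
	assert(ep->refclk_on);
}

/* Models pci_epc_deinit_notify() + dw_pcie_ep_cleanup(): touches registers. */
void mock_cleanup(struct mock_ep *ep)
{
	mock_reg_access(ep);
	ep->initialized = false;
	ep->cleanups_done++;
}

/* PERST# asserted: refclk may go away soon, so only record that cleanup is owed. */
void mock_perst_assert(struct mock_ep *ep)
{
	ep->cleanup_pending = true;
}

/* PERST# deasserted: refclk is assumed active, so clean up first, then init. */
void mock_perst_deassert(struct mock_ep *ep)
{
	if (ep->cleanup_pending) {
		mock_cleanup(ep);
		ep->cleanup_pending = false;
	}
	mock_reg_access(ep); /* models dw_pcie_ep_init_registers() */
	ep->initialized = true;
}
```

In this model the caller plays the host: it toggles refclk_on around calls to
mock_perst_assert() and mock_perst_deassert().  Whether the flag could be
replaced by a check of existing state (here, the initialized field) is exactly
the open question above; the sketch does not answer it for the real driver.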