> -----Original Message----- > From: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx> > Sent: Friday, February 7, 2025 9:58 AM > To: Wilson Ding <dingwei@xxxxxxxxxxx>; cassel@xxxxxxxxxx > Cc: Bjorn Helgaas <helgaas@xxxxxxxxxx>; lpieralisi@xxxxxxxxxx; > thomas.petazzoni@xxxxxxxxxxx; kw@xxxxxxxxx; robh@xxxxxxxxxx; > bhelgaas@xxxxxxxxxx; linux-pci@xxxxxxxxxxxxxxx; linux-arm- > kernel@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Sanghoon Lee > <salee@xxxxxxxxxxx> > Subject: [EXTERNAL] Re: [PATCH 1/1] PCI: armada8k: Add link-down handle > > + Niklas (who was interested in link down handling) On Sat, Feb 01, 2025 > + at 11: 05: 56PM +0000, Wilson Ding wrote: > > On November 13, 2024 3: > + 02: 55 AM GMT+05: 30, Bjorn Helgaas > > <mailto: helgaas@ kernel. org> > + wrote: > > > > + Niklas (who was interested in link down handling) > > On Sat, Feb 01, 2025 at 11:05:56PM +0000, Wilson Ding wrote: > > > On November 13, 2024 3:02:55 AM GMT+05:30, Bjorn Helgaas > > > <mailto:helgaas@xxxxxxxxxx> wrote: > > > >In subject: > > > > > > > > PCI: armada8k: Add link-down handling > > > > > > > >On Mon, Nov 11, 2024 at 10:48:13PM -0800, Jenishkumar Maheshbhai > > > Patel wrote: > > > >> In PCIE ISR routine caused by RST_LINK_DOWN we schedule work to > > > >> handle the link-down procedure. > > > >> Link-down procedure will: > > > >> 1. Remove PCIe bus > > > >> 2. Reset the MAC > > > >> 3. Reconfigure link back up > > > >> 4. Rescan PCIe bus > > > > > > > >s/PCIE/PCIe/ > > > > > > > >Rewrap to fill 75 columns. > > > > > > > >I assume this basically removes a Root Port (and the hierarchy > > > >below > > > >it) if the link goes down, and then resets the MAC and tries to > > > >bring up the link and enumerate the hierarchy again. > > > > > > > >No other drivers do this, so why does armada8k need it? Is this to > > > >work around some unreliable link? > > > > > > Certainly Qcom IPs have this same feature and I was also looking to > > > implement it. But the link down should not be handled by this in the > controller driver. > > > > > > Instead, it should be tied to bus reset in the core and the reset > > > should be done through a callback implemented in the controller > > > drivers. This way, the reset cannot happen in the back of PCI core and client > drivers. > > > > > > That said, the Link down IRQ received by this driver should also be > > > propagated back to the PCI core and the core should then call the > > > callback to reset the bus that I mentioned above. > > > > > > > It's more than a work-around for the unreliable link. A few customers > > may have such application - independent power supply to the device > > with dedicated reset GPIO to #PRST. In this way, the power cycle and > > warm reset of RC and EP won't have impact on each other. However, it > > may lead into the PCI driver not aware of the link down when an unexpected > power down or reset occurs on the device. > > We cannot assume the link will be recovered soon. The worse thing is > > the driver may continue access to the device, which may hang the bus. > > Since the device is no longer present on the bus, it's better to > > remove it. Besides, in order to bring up the link, the only way is to > > reset the MAC, which starts over the state machine of LTSSM. > > > > Well, we also noticed that there is no other driver that did this. I > > agree it is not necessary if the power cycle or warm reset of the > > device is done gracefully. The user can remove the device prior to the > > power cycle/reset. And do the rescan after the link is recovered. However, > the unexpected power down is still possible. > > Please enlighten me if there is any better approach to handle such > > unexpected link down. > > > > There is no issue in retraining the link. My concern is that, the retrain should > not happen autonomously in the controller driver. PCI core should be made > informed of it. More below. > Do you mean - pass the link down/up events to PCI core - remove the device or hierarchy by PCI core upon link down - initiate the link retraining in PCI core by calling the platform retrain callbacks - rescan the bus once link is recovered > > In the meanwhile, I am quite interested the callback implementation > > suggested by Mani. But I am sure if we have such infrastructure right > > there. Mani, could you please elaborate a bit more, or is there any > > examples in the existing code and patches. > > > > There is no such implementation available right now. This idea is on my mind > for quite some time, but never had time to do it. Maybe this gives me > motivation to do so. > > Niklas: Did you get a chance to look into this? Else, let me know. I'll take a stab > at it. > > - Mani > > -- > மணிவண்ணன் சதாசிவம்