Hi Sathyanarayanan, On Tue, Feb 18, 2025 at 10:58:19AM -0800, Sathyanarayanan Kuppuswamy wrote: > > On 2/17/25 7:48 PM, Feng Tang wrote: > > There was problem reported by firmware developers that they received > > 2 pcie link control commands in very short intervals on an ARM server, > > which doesn't comply with pcie spec, and broke their state machine and > > work flow. According to PCIe 6.1 spec, section 6.7.3.2, software needs > > to wait at least 1 second for the command-complete event, before > > resending the cmd or sending a new cmd. > > > > And the first link control command firmware received is from > > get_port_device_capability(), which sends cmd to disable pcie hotplug > > interrupts without waiting for its completion. > > Were you able to narrow down the source of the second command? The > reason I am asking is, the commit you are trying to fix seems to have > existed for 10+ years and no one had faced any issues with it. So > I am wondering whether this needs to fixed at this place or before > executing the second command. The second command comes from pcie_enable_notification(), which in our case will send command to enable hotplug interrupt again. The firmware developer found the problem when handling some device hotplug case, kind of stress case. I think maybe it's better to add the wait after this first command, which follows the PCIe spec naturally. Also v2 patch adds the logic of skipping the interrupt-disabling command, if it has been disabled earlier, either by kernel or BIOS. > > > > > Fix it by adding the necessary wait to comply with PCIe spec, referring > > pcie_poll_cmd(). > > > > Also make the interrupt disabling not dependent on whether pciehp > > service driver will be loaded as suggested by Lukas. > > May be this needs a new patch? Yes, will do. > > > > Fixes: 2bd50dd800b5 ("PCI: PCIe: Disable PCIe port services during port initialization") > > Originally-by: Liguang Zhang <zhangliguang@xxxxxxxxxxxxxxxxx> > > Suggested-by: Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx> > > Signed-off-by: Feng Tang <feng.tang@xxxxxxxxxxxxxxxxx> > > --- > > Code wise it looks fine to me. Thanks for the review! - Feng > > Changlog: > > > > since v1: > > * Add the Originally-by for Liguang. The issue was found on a 5.10 kernel, > > then 6.6. I was initially given a 5.10 kernel tar bar without git info to > > debug the issue, and made the patch. Thanks to Guanghui who recently pointed > > me to tree https://gitee.com/anolis/cloud-kernel which show the wait logic > > in 5.10 was originally from Liguang, and never hit mainline. > > * Make the irq disabling not dependent on wthether pciehp service driver > > will be loaded (Lukas Wunner) > > * Use read_poll_timeout() API to simply the waiting logic (Sathyanarayanan > > Kuppuswamy) > > * Add logic to skip irq disabling if it is already disabled. > > > > drivers/pci/pci.h | 2 ++ > > drivers/pci/pcie/portdrv.c | 44 +++++++++++++++++++++++++++++++++----- > > 2 files changed, 41 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h > > index 01e51db8d285..c1e234d1b81d 100644 > > --- a/drivers/pci/pci.h > > +++ b/drivers/pci/pci.h > > @@ -759,12 +759,14 @@ static inline void pcie_ecrc_get_policy(char *str) { } > > #ifdef CONFIG_PCIEPORTBUS > > void pcie_reset_lbms_count(struct pci_dev *port); > > int pcie_lbms_count(struct pci_dev *port, unsigned long *val); > > +void pcie_disable_hp_interrupts_early(struct pci_dev *dev); > > #else > > static inline void pcie_reset_lbms_count(struct pci_dev *port) {} > > static inline int pcie_lbms_count(struct pci_dev *port, unsigned long *val) > > { > > return -EOPNOTSUPP; > > } > > +static inline void pcie_disable_hp_interrupts_early(struct pci_dev *dev) {} > > #endif > > struct pci_dev_reset_methods { > > diff --git a/drivers/pci/pcie/portdrv.c b/drivers/pci/pcie/portdrv.c > > index 02e73099bad0..2470333bba2f 100644 > > --- a/drivers/pci/pcie/portdrv.c > > +++ b/drivers/pci/pcie/portdrv.c > > @@ -18,6 +18,7 @@ > > #include <linux/string.h> > > #include <linux/slab.h> > > #include <linux/aer.h> > > +#include <linux/iopoll.h> > > #include "../pci.h" > > #include "portdrv.h" > > @@ -205,6 +206,40 @@ static int pcie_init_service_irqs(struct pci_dev *dev, int *irqs, int mask) > > return 0; > > } > > +static int pcie_wait_sltctl_cmd_raw(struct pci_dev *dev) > > +{ > > + u16 slot_status = 0; > > + int ret, ret1, timeout_us; > > + > > + /* 1 second, according to PCIe spec 6.1, section 6.7.3.2 */ > > + timeout_us = 1000000; > > + ret = read_poll_timeout(pcie_capability_read_word, ret1, > > + (slot_status & PCI_EXP_SLTSTA_CC), 10000, > > + timeout_us, true, dev, PCI_EXP_SLTSTA, > > + &slot_status); > > + if (!ret) > > + pcie_capability_write_word(dev, PCI_EXP_SLTSTA, > > + PCI_EXP_SLTSTA_CC); > > + > > + return ret; > > +} > > + > > +void pcie_disable_hp_interrupts_early(struct pci_dev *dev) > > +{ > > + u16 slot_ctrl = 0; > > + > > + pcie_capability_read_word(dev, PCI_EXP_SLTCTL, &slot_ctrl); > > + /* Bail out early if it is already disabled */ > > + if (!(slot_ctrl & (PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE))) > > + return; > > + > > + pcie_capability_clear_word(dev, PCI_EXP_SLTCTL, > > + PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE); > > + > > + if (pcie_wait_sltctl_cmd_raw(dev)) > > + pci_info(dev, "Timeout on disabling PCIE hot-plug interrupt\n"); > > +} > > + > > /** > > * get_port_device_capability - discover capabilities of a PCI Express port > > * @dev: PCI Express port to examine > > @@ -222,16 +257,15 @@ static int get_port_device_capability(struct pci_dev *dev) > > if (dev->is_hotplug_bridge && > > (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT || > > - pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM) && > > - (pcie_ports_native || host->native_pcie_hotplug)) { > > - services |= PCIE_PORT_SERVICE_HP; > > + pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM)) { > > + if (pcie_ports_native || host->native_pcie_hotplug) > > + services |= PCIE_PORT_SERVICE_HP; > > /* > > * Disable hot-plug interrupts in case they have been enabled > > * by the BIOS and the hot-plug service driver is not loaded. > > */ > > - pcie_capability_clear_word(dev, PCI_EXP_SLTCTL, > > - PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE); > > + pcie_disable_hp_interrupts_early(dev); > > } > > #ifdef CONFIG_PCIEAER > > -- > Sathyanarayanan Kuppuswamy > Linux Kernel Developer