[+cc other drivers that use the broken "is link up" test in config access] On Fri, Dec 20, 2019 at 05:11:12PM +0530, Bharat Kumar Gogada wrote: > - Adding support for versal CPM as Root Port. > - The Versal ACAP devices include CCIX-PCIe Module (CPM). The integrated > block for CPM along with the integrated bridge can function > as PCIe Root Port. > - Versal CPM uses GICv3 ITS feature for assigning MSI/MSI-X > vectors and handling MSI/MSI-X interrupts. > - Bridge error and legacy interrupts in versal CPM are handled using > versal CPM specific MISC interrupt line. "Versal" appears to be a brand name and should be capitalized consistently. > +#define INTX_NUM 4 Don't we have a generic PCI core definition for this? > +static inline bool cpm_pcie_link_is_up(struct xilinx_cpm_pcie_port *port) Please follow the example of other drivers and name this "cpm_pcie_link_up()". > +{ > + return (pcie_read(port, XILINX_CPM_PCIE_REG_PSCR) & > + XILINX_CPM_PCIE_REG_PSCR_LNKUP) ? 1 : 0; > +} > +/** > + * xilinx_cpm_pcie_valid_device - Check if a valid device is present on bus > + * @bus: PCI Bus structure > + * @devfn: device/function > + * > + * Return: 'true' on success and 'false' if invalid device is found > + */ > +static bool xilinx_cpm_pcie_valid_device(struct pci_bus *bus, > + unsigned int devfn) > +{ > + struct xilinx_cpm_pcie_port *port = bus->sysdata; > + > + /* Check if link is up when trying to access downstream ports */ > + if (bus->number != port->root_busno) > + if (!cpm_pcie_link_is_up(port)) > + return false; This check for the link being up is racy and should be removed. The link may go down after the check and before the config access. A config access to a device where the link is down is an error, but it's an error we expect to happen because of enumeration, surprise unplug, or electrical issue. It's impossible to avoid so we must be able to handle and recover from it. I know there are other drivers that do this (dwc, altera, xilinix-nwl, xilinx), and I've pointed this out many times in the past. This needs to be fixed. > + > + /* Only one device down on each root port */ > + if (bus->number == port->root_busno && devfn > 0) > + return false; > + > + return true; > +} > +static irqreturn_t xilinx_cpm_pcie_intr_handler(int irq, void *data) > +{ > + struct xilinx_cpm_pcie_port *port = > + (struct xilinx_cpm_pcie_port *)data; struct device *dev = port->dev; to reduce repetition of "port->dev" below. > + u32 val, mask, status, bit; > + unsigned long intr_val; > + > + /* Read interrupt decode and mask registers */ > + val = pcie_read(port, XILINX_CPM_PCIE_REG_IDR); > + mask = pcie_read(port, XILINX_CPM_PCIE_REG_IMR); > + > + status = val & mask; > + if (!status) > + return IRQ_NONE; > + > + if (status & XILINX_CPM_PCIE_INTR_LINK_DOWN) > + dev_warn(port->dev, "Link Down\n"); > + > + if (status & XILINX_CPM_PCIE_INTR_HOT_RESET) > + dev_info(port->dev, "Hot reset\n"); > ...