On Thu, Oct 12, 2023 at 10:15:09AM +0530, Siddharth Vadapalli wrote:
> Hello Bjorn,
>
> Thank you for reviewing the patch.
>
> On 11/10/23 19:16, Bjorn Helgaas wrote:
> > Hi Siddharth,
> >
> > On Wed, Oct 11, 2023 at 06:04:51PM +0530, Siddharth Vadapalli wrote:
> >> Since the function dw_pcie_host_init() ignores the absence of link under
> >> the assumption that link can come up later, it is possible that the
> >> pci_host_probe(bridge) function is invoked even when no endpoint device
> >> is connected. In such a situation, the ks_pcie_v3_65_add_bus() function
> >> configures BAR0 when the link is not up, resulting in Completion Timeouts
> >> during the MSI configuration performed later by the PCI Express Port driver
> >> to setup AER, PME and other services. Thus, leave BAR0 disabled if link is
> >> not yet detected when the ks_pcie_v3_65_add_bus() function is invoked.
> >
> > I'm trying to make sense of this.  In this path:
> >
> >   pci_host_probe
> >     pci_scan_root_bus_bridge
> >       pci_register_host_bridge
> >         bus = pci_alloc_bus(NULL)    # root bus
> >         bus->ops->add_bus(bus)
> >           ks_pcie_v3_65_add_bus
> >
> > The BAR0 in question must belong to a Root Port.  And it sounds like
> > the issue must be related to MSI-X, since the original MSI doesn't
> > involve any BARs.
>
> Yes, the issue is related to MSI-X. I will list down the exact set of function
> calls below as well as the place where the completion timeout first occurs:
>
> ks_pcie_probe
> dw_pcie_host_init
> pci_host_probe
> pci_bus_add_devices
> pci_bus_add_device
> device_attach
> __device_attach
> bus_for_each_drv
> __device_attach_driver (invoked using fn(drv, data))
> driver_probe_device
> __driver_probe_device
> really_probe
> pci_device_probe
> pcie_portdrv_probe
> pcie_port_device_register
> pcie_init_service_irqs
> pcie_port_enable_irq_vec
> pci_alloc_irq_vectors
> pci_alloc_irq_vectors_affinity
> __pci_enable_msix_range
> msix_capability_init
> msix_setup_interrupts
> msix_setup_msi_descs
> msix_prepare_msi_desc
>
> In this function: msix_prepare_msi_desc, the following readl()
> causes completion timeout:
>   desc->pci.msix_ctrl = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
>
> The completion timeout with the readl is only observed when the link
> is down (No Endpoint device is actually connected to the PCIe
> connector slot).

Do you know the address ("addr")?  From pci_msix_desc_addr(), it looks
like it should be:

  desc->pci.mask_base + desc->msi_index * PCI_MSIX_ENTRY_SIZE

and desc->pci.mask_base should be dev->msix_base, which we got from
msix_map_region(), which ioremaps part of the BAR indicated by the
MSI-X Table Offset/Table BIR register.

I wonder if this readl() is being handled as an MMIO access to a
downstream device instead of a Root Port BAR access because it's
inside the Root Port's MMIO window.

Could you dump out these values just before the readl()?

  phys_addr inside msix_map_region()
  dev->msix_base
  desc->pci.mask_base
  desc->msi_index
  addr
  call early_dump_pci_device() on the Root Port

Bjorn
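
For reference, the address arithmetic above can be modelled in plain
userspace C. This is only a rough sketch, not kernel code: the two
constants match include/uapi/linux/pci_regs.h, but the helper names
(msix_vector_ctrl_addr, addr_in_window) and the idea of passing the
window bounds as plain integers are assumptions made for illustration.
It computes the Vector Control address for a given table entry and
checks whether that address falls inside a given MMIO window.

```c
#include <assert.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/pci_regs.h */
#define PCI_MSIX_ENTRY_SIZE         16
#define PCI_MSIX_ENTRY_VECTOR_CTRL  12

/*
 * Hypothetical helper: address of the Vector Control word for MSI-X
 * table entry msi_index, given the table base (mask_base). This mirrors
 * mask_base + msi_index * PCI_MSIX_ENTRY_SIZE, plus the
 * PCI_MSIX_ENTRY_VECTOR_CTRL offset that readl() adds.
 */
static inline uint64_t msix_vector_ctrl_addr(uint64_t mask_base,
					     unsigned int msi_index)
{
	return mask_base + (uint64_t)msi_index * PCI_MSIX_ENTRY_SIZE +
	       PCI_MSIX_ENTRY_VECTOR_CTRL;
}

/*
 * Hypothetical helper: is addr inside the inclusive range
 * [start, end]? Intended for comparing the physical address of the
 * MSI-X table against the Root Port's memory window.
 */
static inline int addr_in_window(uint64_t addr, uint64_t start, uint64_t end)
{
	assert(end >= start);
	return addr >= start && addr <= end;
}
```

Feeding in the dumped phys_addr (rather than the ioremapped virtual
address) together with the window base/limit from the Root Port's
config space would show whether the table address overlaps the window.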