On 13/10/23 10:33, Siddharth Vadapalli wrote:
>
>
> On 12/10/23 22:13, Bjorn Helgaas wrote:
>> On Thu, Oct 12, 2023 at 10:15:09AM +0530, Siddharth Vadapalli wrote:
>>> Hello Bjorn,
>>>
>>> Thank you for reviewing the patch.
>>>
>>> On 11/10/23 19:16, Bjorn Helgaas wrote:
>>>> Hi Siddharth,
>>>>
>
> ...
>
>>> msix_prepare_msi_desc
>>> In this function: msix_prepare_msi_desc, the following readl()
>>> causes completion timeout:
>>>     desc->pci.msix_ctrl = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
>>> The completion timeout with the readl is only observed when the link
>>> is down (No Endpoint device is actually connected to the PCIe
>>> connector slot).
>>
>> Do you know the address ("addr")? From pci_msix_desc_addr(), it looks
>> like it should be:
>>
>>   desc->pci.mask_base + desc->msi_index * PCI_MSIX_ENTRY_SIZE
>>
>> and desc->pci.mask_base should be dev->msix_base, which we got from
>> msix_map_region(), which ioremaps part of the BAR indicated by the
>> MSI-X Table Offset/Table BIR register.
>>
>> I wonder if this readl() is being handled as an MMIO access to a
>> downstream device instead of a Root Port BAR access because it's
>> inside the Root Port's MMIO window.
>>
>> Could you dump out these values just before the readl()?
>>
>>   phys_addr inside msix_map_region()
>>   dev->msix_base
>>   desc->pci.mask_base
>>   desc->msi_index
>>   addr
>
> phys_addr: 0x10102000
> msix_base: 0xffff80000997a000
> mask_base: 0xffff80000997a000
> msi_index: 0
> addr: 0xffff80000997a000
>
> Also, the details of BAR allocation from the logs are:
> keystone-pcie 5500000.pcie: host bridge /bus@100000/pcie@5500000 ranges:
> keystone-pcie 5500000.pcie: IO 0x0010020000..0x001002ffff -> 0x0000000000
> keystone-pcie 5500000.pcie: MEM 0x0010030000..0x0017ffffff -> 0x0010030000
> keystone-pcie 5500000.pcie: iATU unroll: enabled
> keystone-pcie 5500000.pcie: iATU regions: 8 ob, 8 ib, align 64K, limit 4G
> keystone-pcie 5500000.pcie: Phy link never came up
> keystone-pcie 5500000.pcie: PCI host bridge to bus 0000:00
> pci_bus 0000:00: root bus resource [bus 00-ff]
> pci_bus 0000:00: root bus resource [io 0x0000-0xffff]
> pci_bus 0000:00: root bus resource [mem 0x10030000-0x17ffffff]
> pci 0000:00:00.0: [104c:b00c] type 01 class 0x060400
> pci 0000:00:00.0: reg 0x10: [mem 0x05500000-0x055fffff]
> pci 0000:00:00.0: reg 0x38: [mem 0x00000000-0x0000ffff pref]
> pci 0000:00:00.0: supports D1
> pci 0000:00:00.0: PME# supported from D0 D1 D3hot
> pci 0000:00:00.0: BAR 0: assigned [mem 0x10100000-0x101fffff]
> pci 0000:00:00.0: BAR 6: assigned [mem 0x10030000-0x1003ffff pref]
> pci 0000:00:00.0: PCI bridge to [bus 01-ff]
>
> The value of phys_addr lies within the range allocated to BAR0.
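
For reference, the arithmetic behind the phys_addr and BAR 0 values quoted
above can be checked with a small userspace sketch. This is only an
illustration, not kernel code: in_range() and vector_ctrl_offset() are
hypothetical helper names, the BAR 0 range and phys_addr are copied from the
log, and the two PCI_MSIX_ENTRY_* values are the ones I believe are defined in
include/uapi/linux/pci_regs.h.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match include/uapi/linux/pci_regs.h */
#define PCI_MSIX_ENTRY_SIZE        16
#define PCI_MSIX_ENTRY_VECTOR_CTRL 12

/* Copied from the log output quoted above */
#define BAR0_START 0x10100000ULL
#define BAR0_END   0x101fffffULL
#define PHYS_ADDR  0x10102000ULL

/* Hypothetical helper: is addr inside the inclusive range [start, end]? */
int in_range(uint64_t addr, uint64_t start, uint64_t end)
{
	return addr >= start && addr <= end;
}

/*
 * Hypothetical helper: byte offset of the Vector Control dword of entry
 * msi_index from the start of the MSI-X table, i.e. the offset that the
 * readl() in msix_prepare_msi_desc() ends up adding to mask_base.
 */
uint64_t vector_ctrl_offset(unsigned int msi_index)
{
	return (uint64_t)msi_index * PCI_MSIX_ENTRY_SIZE +
	       PCI_MSIX_ENTRY_VECTOR_CTRL;
}

/* Self-check against the logged values; aborts if any of them disagree. */
void check_log_values(void)
{
	/* The MSI-X table is mapped inside BAR 0 ... */
	assert(in_range(PHYS_ADDR, BAR0_START, BAR0_END));
	/* ... 0x2000 bytes from its start ... */
	assert(PHYS_ADDR - BAR0_START == 0x2000);
	/* ... so entry 0's Vector Control sits at physical 0x1010200c. */
	assert(PHYS_ADDR + vector_ctrl_offset(0) == 0x1010200cULL);
}
```

Calling check_log_values() from a trivial main() should return without
aborting for the values above. With msi_index 0, the readl() would then target
physical address 0x1010200c, which is inside BAR 0 and also inside the root
bus window [mem 0x10030000-0x17ffffff].
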
>
>> call early_dump_pci_device() on the Root Port
>
> I invoked early_dump_pci_device() within the pci_setup_device() function in
> drivers/pci/probe.c and the output is:
>
> pci 0000:00:00.0: config space:
> 00000000: 4c 10 0c b0 07 01 10 00 01 00 04 06 00 00 01 00
> 00000010: 00 00 50 05 00 00 00 00 00 01 ff 00 00 00 00 00
> 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00000030: 00 00 00 00 40 00 00 00 00 00 00 00 ff 01 00 00
> 00000040: 01 50 c3 5b 08 00 00 00 00 00 00 00 00 00 00 00
> 00000050: 05 70 80 01 00 00 00 00 00 00 00 00 00 00 00 00
> 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00000070: 10 b0 42 00 01 80 00 00 10 20 00 00 22 3c 73 00
> 00000080: 00 00 11 10 00 00 00 00 c0 03 40 00 00 00 01 00
> 00000090: 00 00 00 00 1f 04 00 00 00 00 00 00 06 00 00 00
> 000000a0: 02 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00
> 000000b0: 11 00 00 01 00 20 00 00 00 40 00 00 00 00 00 00
> 000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

I also noticed that the value of desc->pci.msix_ctrl obtained from the readl is
always 0xffffffff irrespective of whether or not an endpoint device is
connected. This isn't expected right? The only difference between the cases
where endpoint device is connected and isn't connected is the completion
timeout.

-- 
Regards,
Siddharth.