Hi Vidya/Will,

On Sun, Apr 28, 2024 at 08:23:18AM +0100, Will Deacon wrote:
> On Wed, Apr 10, 2024 at 02:28:40PM -0500, Bjorn Helgaas wrote:
> > [+cc Will, Joerg]
> >
> > On Mon, Apr 01, 2024 at 10:40:15AM +0000, Vidya Sagar wrote:
> > > Hi folks,
> > > ACS (Access Control Services) is configured for a PCI device through
> > > pci_enable_acs(). The first thing pci_enable_acs() checks for is
> > > whether the global flag 'pci_acs_enable' is set or not. The global
> > > flag 'pci_acs_enable' is set by the function pci_request_acs().
> > >
> > > pci_enable_acs() function is called whenever a new PCI device is
> > > added to the system
> > >
> > >  pci_enable_acs+0x4c/0x2a4
> > >  pci_acs_init+0x38/0x60
> > >  pci_device_add+0x1a0/0x670
> > >  pci_scan_single_device+0xc4/0x100
> > >  pci_scan_slot+0x6c/0x1e0
> > >  pci_scan_child_bus_extend+0x48/0x2e0
> > >  pci_scan_root_bus_bridge+0x64/0xf0
> > >  pci_host_probe+0x18/0xd0
> > >
> > > In the case of a system that boots using device-tree blob,
> > > pci_request_acs() is called when the device driver binds with the
> > > respective device
> > >
> > >  of_iommu_configure+0xf4/0x230
> > >  of_dma_configure_id+0x110/0x340
> > >  pci_dma_configure+0x54/0x120
> > >  really_probe+0x80/0x3e0
> > >  __driver_probe_device+0x88/0x1c0
> > >  driver_probe_device+0x3c/0x140
> > >  __device_attach_driver+0xe8/0x1e0
> > >  bus_for_each_drv+0x78/0xf0
> > >  __device_attach+0x104/0x1e0
> > >  device_attach+0x14/0x30
> > >  pci_bus_add_device+0x50/0xd0
> > >  pci_bus_add_devices+0x38/0x90
> > >  pci_host_probe+0x40/0xd0
> > >
> > > Since the device addition always happens first followed by the
> > > driver binding, this flow effectively makes sure that ACS never gets
> > > enabled.
> > >
> > > Ideally, I would expect the pci_request_acs() get called (probably
> > > by the OF framework itself) before calling pci_enable_acs().
> > >
> > > This happens in the ACPI flow where pci_request_acs() is called
> > > during IORT node initialization (i.e. iort_init_platform_devices()
> > > function).
> > >
> > > Is this understanding correct? If yes, would it make sense to call
> > > pci_request_acs() during OF initialization (similar to IORT
> > > initialization in ACPI flow)?
> >
> > Your understanding looks correct to me.  My call graph notes, FWIW:
> >
> >   mem_init
> >     pci_iommu_alloc                   # x86 only
> >       amd_iommu_detect                # init_state = IOMMU_START_STATE
> >         iommu_go_to_state(IOMMU_IVRS_DETECTED)
> >           state_next
> >             switch (init_state)
> >             case IOMMU_START_STATE:
> >               detect_ivrs
> >                 pci_request_acs
> >                   pci_acs_enable = 1  # <--
> >       detect_intel_iommu
> >         pci_request_acs
> >           pci_acs_enable = 1          # <--
> >
> >   pci_scan_single_device              # PCI enumeration
> >     ...
> >     pci_init_capabilities
> >       pci_acs_init
> >         pci_enable_acs
> >           if (pci_acs_enable)         # <--
> >             pci_std_enable_acs
> >
> >   __driver_probe_device
> >     really_probe
> >       pci_dma_configure               # pci_bus_type.dma_configure
> >         if (OF)
> >           of_dma_configure
> >             of_dma_configure_id
> >               of_iommu_configure
> >                 pci_request_acs       # <-- 6bf6c24720d3
> >                 iommu_probe_device
> >         else if (ACPI)
> >           acpi_dma_configure
> >             acpi_dma_configure_id
> >               acpi_iommu_configure_id
> >                 iommu_probe_device
> >
> > The pci_request_acs() in of_iommu_configure(), which happens too late
> > to affect pci_enable_acs(), was added by 6bf6c24720d3 ("iommu/of:
> > Request ACS from the PCI core when configuring IOMMU linkage"), so I
> > cc'd Will and Joerg.  I don't know if that *used* to work and got
> > broken somehow, or if it never worked as intended.
>
> I don't have any way to test this, but I'm supportive of having the same
> flow for DT and ACPI-based flows. Vidya, are you able to cook a patch?
>

I ran into a similar observation while testing PCI device assignment
to a VM. In my configuration, the virtio-iommu is enumerated over the
PCI transport, so I think we can't hook pci_request_acs() into an
IOMMU driver. Does the patch below make sense?

I tested the patch with a VM: ACS gets enabled and separate IOMMU
groups are created for the devices attached under the PCIe root
port(s).

RCs/devices with ACS quirks do not suffer from this problem, because
the ACS capability detection is short-circuited in
pci_acs_enabled()->pci_dev_specific_acs_enabled(). Maybe this is one
of the reasons why this was not reported/observed on some platforms
with DT.

diff --git a/drivers/pci/of.c b/drivers/pci/of.c
index b908fe1ae951..0eeb7abfbcfa 100644
--- a/drivers/pci/of.c
+++ b/drivers/pci/of.c
@@ -123,6 +123,13 @@ bool pci_host_of_has_msi_map(struct device *dev)
 	return false;
 }
 
+bool pci_host_of_has_iommu_map(struct device *dev)
+{
+	if (dev && dev->of_node)
+		return of_get_property(dev->of_node, "iommu-map", NULL);
+	return false;
+}
+
 static inline int __of_pci_pci_compare(struct device_node *node,
 				       unsigned int data)
 {
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 4c367f13acdc..ea6fcdaf63e2 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -889,6 +889,7 @@ static void pci_set_bus_msi_domain(struct pci_bus *bus)
 	dev_set_msi_domain(&bus->dev, d);
 }
 
+bool pci_host_of_has_iommu(struct device *dev);
 static int pci_register_host_bridge(struct pci_host_bridge *bridge)
 {
 	struct device *parent = bridge->dev.parent;
@@ -951,6 +952,9 @@ static int pci_register_host_bridge(struct pci_host_bridge *bridge)
 	    !pci_host_of_has_msi_map(parent))
 		bus->bus_flags |= PCI_BUS_FLAGS_NO_MSI;
 
+	if (pci_host_of_has_iommu_map(parent))
+		pci_request_acs();
+
 	if (!parent)
 		set_dev_node(bus->bridge, pcibus_to_node(bus));
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index cafc5ab1cbcb..7eceed71236a 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2571,6 +2571,7 @@ struct device_node;
 struct irq_domain;
 struct irq_domain *pci_host_bridge_of_msi_domain(struct pci_bus *bus);
 bool pci_host_of_has_msi_map(struct device *dev);
+bool pci_host_of_has_iommu_map(struct device *dev);
 
 /* Arch may override this (weak) */
 struct device_node *pcibios_get_phb_of_node(struct pci_bus *bus);
@@ -2579,6 +2580,7 @@ struct device_node *pcibios_get_phb_of_node(struct pci_bus *bus);
 static inline struct irq_domain *
 pci_host_bridge_of_msi_domain(struct pci_bus *bus) { return NULL; }
 static inline bool pci_host_of_has_msi_map(struct device *dev) { return false; }
+static inline bool pci_host_of_has_iommu_map(struct device *dev) { return false; }
 #endif  /* CONFIG_OF */
 
 static inline struct device_node *

Thanks,
Pavan