On Thu, May 23, 2024 at 09:59:36AM -0500, Bjorn Helgaas wrote:
> [+cc iommu folks]
>
> On Thu, May 23, 2024 at 12:05:28PM +0530, Vidya Sagar wrote:
> > For iommu_groups to form correctly, the ACS settings in the PCIe fabric
> > need to be setup early in the boot process, either via the BIOS or via
> > the kernel disable_acs_redir parameter.
>
> Can you point to the iommu code that is involved here? It sounds like
> the iommu_groups are built at boot time and are immutable after that?

They are created when the struct device is plugged in.
pci_device_group() does the logic.

Notably, groups can't/don't change if details like ACS change after the
groups are set up.

There are a lot of instructions out there telling people to boot their
servers and then manually change the ACS flags with setpci or
something, and these are not good instructions since they defeat the
VFIO group based security mechanisms.

> If we need per-device ACS config that depends on the workload, it
> seems kind of problematic to only be able to specify this at boot
> time. I guess we would need to reboot if we want to run a workload
> that needs a different config?

Basically. The main difference I'd see is whether the server is a VM
host or running bare metal apps. You can get more efficiency if you
change things for the bare metal case, and often bare metal will want
to turn the iommu off while a VM host often wants more of it turned on.

> Is this the iommu usage model we want in the long term?

There is some path to more dynamic behavior here, but it would require
separating groups into two components - devices that are together
because they are physically sharing translation (aliases and things)
from devices that are together because they share a security boundary
(ACS).

It is more believable that we could dynamically change security group
assignments for VFIO than translation group assignments.

I don't know anyone interested in this right now - Alex and I have
only talked about it as a possibility a while back.

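To make the point about groups being fixed once they are formed easier to check, here is a minimal userspace sketch (not part of the patch under discussion) that lists group membership from sysfs. It assumes the usual /sys/kernel/iommu_groups/<N>/devices/ layout; the function name list_iommu_groups and its root parameter are made up for illustration.

```python
import os


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group_number: [device names]} as exposed under `root`.

    `root` is a parameter only so the function can be pointed at a
    different directory; it is a hypothetical helper, not a kernel API.
    """
    groups = {}
    if not os.path.isdir(root):
        # No IOMMU groups exposed (IOMMU off, or sysfs not mounted).
        return groups
    for name in os.listdir(root):
        devdir = os.path.join(root, name, "devices")
        # Each group is a numbered directory containing a devices/ subdir.
        if name.isdigit() and os.path.isdir(devdir):
            groups[int(name)] = sorted(os.listdir(devdir))
    return groups


if __name__ == "__main__":
    for group, devices in sorted(list_iommu_groups().items()):
        print(f"group {group}: {' '.join(devices)}")
```

Running this before and after a runtime ACS change should show the same grouping, which is the reason the runtime approach gives a false sense of isolation.
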
FWIW I don't view this patch as excluding more dynamism in the future,
but it is the best way to work with the current state of affairs, and
definitely better than setpci instructions.

Thanks,
Jason