On Tue, Jun 15, 2021 at 10:12:15AM -0600, Alex Williamson wrote:

> 1) A dual-function PCIe e1000e NIC where the functions are grouped
>    together due to ACS isolation issues.
>
>    a) Initial state: functions 0 & 1 are both bound to e1000e driver.
>
>    b) Admin uses driverctl to bind function 1 to vfio-pci, creating
>       vfio device file, which is chmod'd to grant to a user.
>
>    c) User opens vfio function 1 device file and an iommu_fd, binds
>       device_fd to iommu_fd.
>
>       Does this succeed?
>         - if no, specifically where does it fail?

No, the e1000e driver is still connected to the device.

It fails during the VFIO_BIND_IOASID_FD call because the iommu common
code checks the group membership for consistency.

We detect it basically the same way things work today, just moved to
the iommu code.

>    d) Repeat b) for function 0.
>    e) Repeat c), still using function 1, is it different?  Where?  Why?

Succeeds because all group device members are now bound to vfio.

It is hard to predict the nicest way to do all of this, but I would
start by imagining that iommu_fd using drivers (like vfio) will call
some kind of iommu_fd_allow_dma_blocking() call during their probe()
which organizes the machinery to drive this.

> 2) The same NIC as 1)
>
>    a) Initial state: functions 0 & 1 bound to vfio-pci, vfio device
>       files granted to user, user has bound both device_fds to the same
>       iommu_fd.
>
>       AIUI, even though not bound to an IOASID, vfio can now enable access
>       through the device_fds, right?

Yes

>       What specific entity has placed these
>       devices into a block DMA state, when, and how?

To keep all the semantics the same it must be done as part of
VFIO_BIND_IOASID_FD.

This will have to go over every device in the group and put it in the
dma blocked state.
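
As a rough illustration only, here is a userspace toy model of the
bind-time group check and DMA blocking step described above. Every name
in it (toy_dev, toy_bind_group, the struct fields) is invented for this
sketch and is not kernel API. It assumes that a device without a driver
is acceptable, and that a driver "opts in" by setting a flag, standing
in for whatever iommu_fd_allow_dma_blocking() would end up doing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy type; nothing here exists in the kernel. */
struct toy_dev {
	bool has_driver;          /* some driver is bound to this function */
	bool allows_dma_blocking; /* that driver opted in during probe() */
	bool dma_blocked;         /* device is in the "block DMA" state */
};

/*
 * Toy stand-in for the check done at VFIO_BIND_IOASID_FD time.
 * Fails with -1 if any group member has a driver that did not opt in
 * (e.g. e1000e still bound to the other function). Otherwise puts every
 * member of the group into the DMA blocked state and returns 0.
 */
static int toy_bind_group(struct toy_dev *devs, size_t n)
{
	size_t i;

	assert(devs != NULL || n == 0);

	/* First pass: check the whole group, changing nothing on failure. */
	for (i = 0; i < n; i++)
		if (devs[i].has_driver && !devs[i].allows_dma_blocking)
			return -1;

	/* Second pass: the group is consistent, block DMA everywhere. */
	for (i = 0; i < n; i++)
		devs[i].dma_blocked = true;

	return 0;
}
```

In this model case 1c) fails in the first loop because function 0 still
has a driver that never opted in, and case 1e) succeeds once both
functions are bound to an opted-in driver.
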
Riffing on the above, this is possible if there is no attached device
driver, or the device driver that is attached has called
iommu_fd_allow_dma_blocking() during its probe().

I haven't gone through all of Kevin's notes about how this could be
sorted out directly in the iommu code though.

>    b) Both devices are attached to the same IOASID.
>
>       Are we assuming that each device was atomically moved to the new
>       IOMMU context by the IOASID code?  What if the IOMMU cannot change
>       the domain atomically?

What does "atomically" mean here? I assume all IOMMU HW can change
IOASIDs without accidentally leaking traffic through. Otherwise that
is a major design restriction.

>    c) The device_fd for function 1 is detached from the IOASID.
>
>       Are we assuming the reverse of b) performed by the IOASID code?

Yes, the IOMMU will change from the active IOASID to the "block DMA"
ioasid in a way that is secure.

>    d) The device_fd for function 1 is unbound from the iommu_fd.
>
>       Does this succeed?

Yes

>         - if yes, what is the resulting IOMMU context of the device and
>           who owns it?

device_fd for function 1 remains set to the "block DMA" ioasid.

Attempting to attach a kernel driver triggers bug_on as today.

Attempting to open it again and use it with a different iommu_fd fails.

>    e) Function 1 is unbound from vfio-pci.
>
>       Does this work or is it blocked?  If blocked, by what entity
>       specifically?

As today, it is allowed. The IOASID would have to remain at the "block
all dma" ioasid until the implicit connection to the group in the
iommu_fd is released.

>    f) Function 1 is bound to e1000e driver.

As today bug_on is triggered via the same maze of notifiers (gross,
but where we are for now). The notifiers would be done by the iommu_fd
instead of vfio.

> 3) A dual-function conventional PCI e1000 NIC where the functions are
>    grouped together due to shared RID.

This operates effectively the same as today. Manipulating a device
implicitly manipulates the group.
Instead of doing the DMA block, the devices track the IOASID the group
is using. We model it by demanding that all devices attach to the same
IOASID, and instead of doing the DMA block step the device remains
attached to the group's IOASID.

Today this is such an uncommon configuration (a PCI bridge!) that we
shouldn't design the entire API around it.

> If vfio gets to offload all of it's group management to IOASID code,
> that's great, but I'm afraid that IOASID is so focused on a
> device-level API that we're instead just ignoring the group dynamics
> and vfio will be forced to provide oversight to maintain secure
> userspace access.

I think it would be a major design failure if VFIO is required to
provide additional security on top of the iommu code. This is
basically the refactoring exercise - to move the VFIO code that is
only about iommu concerns to the iommu layer so that VFIO becomes
thinner.

Otherwise we still can't properly share this code - why should VDPA
and VFIO have different isolation models? Is it just because we expect
that everything except VFIO has 1:1 groups or no group at all? Feels
wonky.

Jason