> From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> Sent: Saturday, August 3, 2024 2:25 AM
>
> On Thu, 1 Aug 2024 07:45:43 +0000
> "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
>
> > > From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > > Sent: Thursday, August 1, 2024 1:05 AM
> > >
> > > On Wed, 31 Jul 2024 05:15:25 +0000
> > > "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
> > >
> > > > > From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > > > > Sent: Wednesday, July 31, 2024 1:35 AM
> > > > >
> > > > > So what are we trying to accomplish here. PASID is the first
> > > > > non-device specific virtual capability that we'd like to insert into
> > > > > the VM view of the capability chain. It won't be the last.
> > > > >
> > > > > - Do we push the policy of defining the capability offset to the user?
> > > >
> > > > Looks yes as I didn't see a strong argument for the opposite way.
> > >
> > > It's a policy choice though, so where and how is it implemented? It
> > > works fine for those of us willing to edit xml or launch VMs by command
> > > line, but libvirt isn't going to sign up to insert a policy choice for
> > > a device. If we get to even higher level tools, does anything that
> > > wants to implement PASID support required a vendor operator driver to
> > > make such policy choices (btw, I'm just throwing out the "operator"
> > > term as if I know what it means, I don't).
> >
> > I had a rough feeling that there might be other usages requiring such
> > vendor plugin, e.g. provisioning VF/ADI may require vendor specific
> > configurations, but not really an expert in this area.
> >
> > Overall I feel most of our discussions so far are about
> > VMM-auto-find-offset vs. file-based-policy-scheme which both belong to
> > user-defined policy, suggesting that we all agreed to drop the other
> > way having kernel define the offset (plus in-kernel quirks, etc.)?
> >
> > Even the said DVSEC is to assist such user-defined direction.
>
> To me a "user defined policy" is placing an option on the command line
> and requiring the user, or some higher level authority representing the
> user, to provide the policy. If it's done by the VMM then we're saying
> QEMU owns the policy but it might be overridden by the user via a
> command line argument or modifying the policy file consumed by QEMU.

Okay, I see the difference here. In my reply it's clearer to say
"userspace" instead of "user". 😊

> > > involved with that, so why does it make sense for vfio-pci to be
> > > involved in reporting something that is more iommufd specific?
> >
> > It doesn't matter which one involves more. It's more akin to the
> > physical world.
> >
> > btw vfio-pci already reports ATS/PRI which both rely on iommufd
> > in vconfig space. Throwing PASID alone to iommufd uAPI lacks of a
> > good justification for why it's special.
> >
> > I envision an extension to vfio device feature or a new vfio uAPI
> > for reporting virtual capabilities as augment to the ones filled in
> > vconfig space.
>
> Should ATS and PRI be reported through vfio-pci or should we just turn
> them off to be more like PASID? Maybe the issue simply hasn't arisen
> yet because we don't have vIOMMU support and with that support QEMU
> might need to filter out those capabilities and look elsewhere.
> Anyway, iommufd and vfio-pci should not duplicate each other here.

If no-duplication is the agreed way, yes it's clearer to have
ATS/PRI/PASID reported consistently via iommufd hence hidden
in vfio-pci. Given there is no vIOMMU support this change shouldn't
break any applications.

>
> > > > > then we just look for a gap and add the capability. If we end up with
> > > > > different results between source and target for migration, then
> > > > > migration will fail. Possibly we end up with a quirk table to override
> > > > > the default placement of specific capabilities on specific devices.
> > > >
> > > > emm how does a quirk table work with devices having volatile config
> > > > space layout cross FW versions? Can VMM assigned with a VF be able
> > > > to check the FW version of the PF?
> > >
> > > If the VMM can't find the same gap between source and destination then
> > > a quirk could make sure that the PASID offset is consistent. But also
> > > if the VMM doesn't find the same gap then that suggests the config
> > > space is already different and not only the offset of the PASID
> > > capability will need to be fixed via a quirk, so then we're into
> > > quirking the entire capability space for the device.
> >
> > yes. So the quirk table is more for fixing the functional gap (i.e. not
> > overlap with a hidden register) instead of for migration. As long as
> > a device can function correctly with it, the virtual caps fall into the
> > same restriction as physical caps in migration i.e. upon inconsistent
> > layout between src/dest we'll need separate way to synthesize the
> > entire space.
>
> Yes.
>
> > > The VMM should not be assumed to have any additional privileges beyond
> > > what we provide it through the vfio device and iommufd interface.
> > > Testing anything about the PF would require access on the host that
> > > won't work in more secure environments. Therefore if we can't
> > > consistently place the PASID for a device, we probably need to quirk it
> > > based on the vendor/device IDs or sub-IDs or we need to rely on a
> > > management implied policy such as a device profile option on the QEMU
> > > command line or maybe different classes of the vfio-pci driver in QEMU.
> > >
> > > > > That might evolve into a lookup for where we place all capabilities,
> > > > > which essentially turns into the "file" where the VMM defines the entire
> > > > > layout for some devices.
> > > >
> > > > Overall this sounds a feasible path to move forward - starting with
> > > > the VMM to find the gap automatically if a new PASID option is
> > > > opted in. Devices with hidden registers may fail. Devices with volatile
> > > > config space due to FW upgrade or cross vendors may fail to migrate.
> > > > Then evolving it to the file-based scheme, and there is time to discuss
> > > > any intermediate improvement (fixed quirks, cmdline offset, etc.) in
> > > > between.
> > >
> > > As above, let's be careful about introducing unnecessary command line
> > > options, especially if we expect support for them in higher level
> > > tools. If we place the PASID somewhere that makes the device not work,
> > > then disabling PASID on the vIOMMU should resolve that. It won't be a
> >
> > vIOMMU is per-platform then it applies to all devices behind, including
> > those which don't have a problem with auto-selected offset. Not sure
> > whether one would want to continue enabling PASID for other devices
> > or should stop immediately to find a quirk for the problematic one and
> > then resume.
>
> I'm not sure if this is a real issue, we're talking about a VM, not a
> server. If a user wants PASID support and it's incompatible with a
> device, the device can be excluded from the VM or we can have an
> experimental option on the vfio-pci device in QEMU as a workaround. I
> don't think this is something we need to plumb up through the tool
> stack. Thanks,
>

Okay. With that I edited my earlier reply a bit by removing the
note of cmdline option, adding DVSEC possibility, and making it clear
that the PASID option is in vIOMMU:

"
Overall this sounds a feasible path to move forward - starting with
the VMM to find the gap automatically if PASID is opted in vIOMMU.
Devices with hidden registers may fail. Devices with volatile
config space due to FW upgrade or cross vendors may fail to migrate.
Then evolving it to the file-based scheme, and there is time to discuss
any intermediate improvement (fixed quirks, DVSEC, etc.) in between.
"

Jason, your thoughts?
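
For illustration only, the "VMM finds the gap automatically" step might
look roughly like the Python sketch below. This is not QEMU or kernel
code: the function names (walk_ext_caps, find_gap) are hypothetical, and
the caller is assumed to supply the size of every existing capability,
since the PCIe extended capability header carries no length field. The
sketch also cannot see hidden registers, which is exactly the failure
case mentioned above.

```python
import struct

PCI_CFG_SPACE_SIZE = 0x100     # extended config space starts here
PCIE_CFG_SPACE_SIZE = 0x1000   # end of PCIe config space
PASID_CAP_LEN = 8              # 4-byte header + capability + control regs


def walk_ext_caps(cfg: bytes):
    """Yield (offset, cap_id) for each extended capability in the chain."""
    offset = PCI_CFG_SPACE_SIZE
    seen = set()  # guards against a looping (malformed) chain
    while (offset >= PCI_CFG_SPACE_SIZE and offset not in seen
           and offset + 4 <= len(cfg)):
        seen.add(offset)
        (header,) = struct.unpack_from("<I", cfg, offset)
        if header in (0, 0xFFFFFFFF):  # no capability present
            break
        yield offset, header & 0xFFFF      # bits 15:0 are the capability ID
        offset = (header >> 20) & 0xFFC    # bits 31:20, DWORD aligned


def find_gap(occupied, length=PASID_CAP_LEN):
    """Return the lowest DWORD-aligned free offset, or None.

    `occupied` is an iterable of (offset, size) ranges the VMM believes
    are in use. Assumption: these sizes come from the VMM's own table of
    known capabilities; hidden registers are invisible to this search.
    """
    cursor = PCI_CFG_SPACE_SIZE
    for start, size in sorted(occupied):
        if start - cursor >= length:
            return cursor
        cursor = max(cursor, (start + size + 3) & ~3)
    if PCIE_CFG_SPACE_SIZE - cursor >= length:
        return cursor
    return None
```

Because the result depends only on the visible layout, source and
destination pick the same offset only when their config space layouts
match, which is why a layout change across FW versions would make
migration fail in this scheme.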