On Thu, 4 Aug 2022 21:11:07 -0300
Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:

> On Thu, Aug 04, 2022 at 01:36:24PM -0600, Alex Williamson wrote:
> 
> > > > That is reasonable, but I'd say those three kernels only have two
> > > > drivers and they both have vfio as a substring in their name - so the
> > > > simple thing of just substring searching 'vfio' would get us over that
> > > > gap.
> > > 
> > > Looking at the aliases for exactly "vfio_pci" isn't that much more
> > > complicated, and "feels" a lot more reliable than just doing a substring
> > > search for "vfio" in the driver's name. (It would be, uh, .... "not
> > > smart" to name a driver "vfio<anything>" if it wasn't actually a vfio
> > > variant driver (or the opposite), but I could imagine it happening; :-/)
> 
> This is still pretty hacky. I'm worried about what happens to the
> kernel if this becomes some crazy unintended uAPI that we never really
> thought about carefully... This was not a use case when we designed
> the modules.alias stuff at least.
> 
> BTW - why not do things the normal way?
> 
> 1. readlink /sys/bus/pci/devices/XX/iommu_group
> 2. Compute basename of #1
> 3. Check if /dev/vfio/#2 exists (or /sys/class/vfio/#2)
> 
> It has a small edge case where a multi-device group might give a false
> positive for an undrivered device, but for the purposes of libvirt
> that seems pretty obscure.. (while the above has false negative
> issues, obviously)

This is not a small edge case; it's extremely common. We have a *lot*
of users assigning desktop GPUs and other consumer-grade hardware,
which are usually multi-function devices without isolation exposed via
ACS or quirks.

The vfio group exists if any device in the group is bound to a vfio
driver, but the device is not accessible from the group unless the
viability test passes. That means QEMU may not be able to get access
to the device, either because the device we want isn't actually bound
to a vfio driver or because another device in the group is not in a
viable state.

Thanks,

Alex
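
For reference, a minimal C sketch of the three-step check Jason
describes above (a hypothetical standalone program, not libvirt's
actual code; the PCI address is a placeholder and error handling is
kept minimal):

#include <limits.h>
#include <stdio.h>
#include <unistd.h>
#include <libgen.h>

int main(void)
{
        const char *bdf = "0000:01:00.0";  /* placeholder PCI address */
        char link[PATH_MAX], target[PATH_MAX], dev[PATH_MAX];
        ssize_t n;

        /* Step 1: resolve the device's iommu_group symlink. */
        snprintf(link, sizeof(link),
                 "/sys/bus/pci/devices/%s/iommu_group", bdf);
        n = readlink(link, target, sizeof(target) - 1);
        if (n < 0) {
                perror("readlink");
                return 1;
        }
        target[n] = '\0';

        /* Step 2: the group number is the basename of the link target,
         * e.g. "../../../kernel/iommu_groups/26" -> "26". */
        char *group = basename(target);

        /* Step 3: does the vfio group node exist?  Per the discussion
         * above, this can give a false positive: the node appears when
         * any device in the group is bound to a vfio driver, which does
         * not prove this particular device is bound or that the group
         * passes the viability test. */
        snprintf(dev, sizeof(dev), "/dev/vfio/%s", group);
        if (access(dev, F_OK) == 0)
                printf("%s: vfio group node %s exists\n", bdf, dev);
        else
                printf("%s: no vfio group node (%s)\n", bdf, dev);

        return 0;
}

Checking /sys/class/vfio/<group> instead of /dev/vfio/<group> would be
a one-line change to the final snprintf, with the same caveat about
multi-device groups.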