On Mon, 9 Apr 2018 12:35:10 +0200
Gerd Hoffmann <kraxel@xxxxxxxxxx> wrote:

> This little series adds three drivers, for demo-ing and testing vfio
> display interface code. There is one mdev device for each interface
> type (mdpy.ko for region and mbochs.ko for dmabuf).

Erik Skultety brought up a good question today regarding how libvirt is
meant to handle these different flavors of display interfaces and
knowing whether a given mdev device has display support at all. It
seems that we cannot simply use the default display=auto because
libvirt needs to specifically configure gl support for a dmabuf type
interface versus not having such a requirement for a region interface,
perhaps even removing the emulated graphics in some cases (though I
don't think we have boot graphics through either solution yet).
Additionally, GVT-g seems to need the x-igd-opregion support
enabled(?), which is a non-starter for libvirt as it's an experimental
option!

Currently the only way to determine display support is through the
VFIO_DEVICE_QUERY_GFX_PLANE ioctl, but for libvirt to probe that on
their own they'd need to get to the point where they could open the
vfio device and perform the ioctl. That means opening a vfio
container, adding the group, setting the iommu type, and getting the
device. I was initially a bit appalled at asking libvirt to do that.
The alternative is to put this information in sysfs, but in doing that
we risk needing to describe every nuance of the mdev device through
sysfs, so that it becomes a dumping ground for every possible feature
an mdev device might have.

So I was ready to return and suggest that maybe libvirt should probe
the device to know about these ancillary configuration details, but
then I remembered that both mdev vGPU vendors had external dependencies
to even allow probing the device.
KVMGT will fail to open the device if it's not associated with an
instance of KVM and NVIDIA vGPU, I believe, will fail if the vGPU
manager process cannot find the QEMU instance to extract the VM UUID.
(Both of these were bad ideas.)

Therefore, how can libvirt know if a given mdev device supports a
display and which type of display it supports, and potentially which
vendor specific options might be required to further enable that
display (if they weren't experimental)?

A terrible solution would be that libvirt hard codes that NVIDIA works
with regions and Intel works with dmabufs, but even then there's a
backwards and forwards compatibility problem, that libvirt needs to
support older kernels and drivers where display support is not present
and newer drivers where perhaps Intel is now doing regions and NVIDIA
is supporting dmabuf, so it cannot simply be assumed based on the
vendor. The only solution I see down that path would be identifying
specific {vendor,type} pairs that support a predefined display type,
but that's just absurd to think that vendors would rev their mdev types
to expose this and that libvirt would keep a database mapping types to
features.

We also have the name and description attributes, but these are
currently free form, so libvirt rightfully ignores them entirely. I
don't know if we could create a defined feature string within those
free form strings.

Otherwise, it seems we have no choice but to dive into the pool of
exposing such features via sysfs and we'll need to be vigilant of
feature creep or vendor specific features (ex. we're not adding a
feature to indicate an opregion requirement). How should we do this?
Perhaps a bar we can set is that if a feature cannot be discovered
through a standard vfio API, then it is not suitable for this sysfs
API. Such things can be described via our existing mdev vendor
specific attribute interface.
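
For reference, the probing sequence described earlier (open a
container, add the group, set the iommu type, get the device, then
issue VFIO_DEVICE_QUERY_GFX_PLANE with the PROBE flag) might look
roughly like the sketch below. It is only an illustration under stated
assumptions: the ioctl numbers and struct layout are transcribed by
hand from <linux/vfio.h> and should be checked against the target
kernel, the encoding assumes an architecture where _IOC_NONE is 0 (x86,
arm64), and the helper names (probe_gfx_plane, pack_probe) are
hypothetical. As noted above, the "get the device" step can still fail
for KVMGT or NVIDIA vGPU outside of a running VM.

```python
import fcntl
import os
import struct

# _IO(VFIO_TYPE, VFIO_BASE + nr) per <linux/vfio.h>. Hand-transcribed;
# assumes _IOC_NONE == 0 (true on x86 and arm64, not on every arch).
VFIO_TYPE = ord(";")
VFIO_BASE = 100


def _vfio_io(nr):
    return (VFIO_TYPE << 8) | (VFIO_BASE + nr)


VFIO_SET_IOMMU = _vfio_io(2)
VFIO_GROUP_SET_CONTAINER = _vfio_io(4)
VFIO_GROUP_GET_DEVICE_FD = _vfio_io(6)
VFIO_DEVICE_QUERY_GFX_PLANE = _vfio_io(14)

VFIO_TYPE1_IOMMU = 1
VFIO_GFX_PLANE_TYPE_PROBE = 1 << 0
VFIO_GFX_PLANE_TYPE_DMABUF = 1 << 1
VFIO_GFX_PLANE_TYPE_REGION = 1 << 2

# struct vfio_device_gfx_plane_info: argsz, flags, drm_plane_type,
# drm_format, drm_format_mod (u64), eight u32 geometry fields, then the
# region_index/dmabuf_id union, padded out to 64 bytes.
GFX_PLANE_INFO = struct.Struct("=4IQ9I4x")


def pack_probe(plane_type):
    """Build a probe request for one plane type (all other fields zero)."""
    flags = VFIO_GFX_PLANE_TYPE_PROBE | plane_type
    return bytearray(GFX_PLANE_INFO.pack(GFX_PLANE_INFO.size, flags, *([0] * 12)))


def _supports(device_fd, plane_type):
    # The probe ioctl succeeds if the plane type is supported and fails
    # with an errno otherwise.
    try:
        fcntl.ioctl(device_fd, VFIO_DEVICE_QUERY_GFX_PLANE, pack_probe(plane_type))
        return True
    except OSError:
        return False


def probe_gfx_plane(group_path, device_name, container_path="/dev/vfio/vfio"):
    """Return {"dmabuf": bool, "region": bool} for one vfio device.

    group_path is e.g. /dev/vfio/<group>; for an mdev device the device
    name is its UUID. Raises OSError if any setup step fails.
    """
    container = os.open(container_path, os.O_RDWR)
    try:
        group = os.open(group_path, os.O_RDWR)
        try:
            # Add the group to the container, then pick the iommu type.
            fcntl.ioctl(group, VFIO_GROUP_SET_CONTAINER, struct.pack("i", container))
            fcntl.ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU)
            # A mutable buffer makes fcntl.ioctl return the new device fd.
            name = bytearray(device_name.encode() + b"\0")
            device = fcntl.ioctl(group, VFIO_GROUP_GET_DEVICE_FD, name)
            try:
                return {
                    "dmabuf": _supports(device, VFIO_GFX_PLANE_TYPE_DMABUF),
                    "region": _supports(device, VFIO_GFX_PLANE_TYPE_REGION),
                }
            finally:
                os.close(device)
        finally:
            os.close(group)
    finally:
        os.close(container)
```

Even in this reduced form the caller needs access to /dev/vfio and a
viable group, which is a lot of machinery just to learn two booleans.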
We currently have this sysfs interface:

   mdev_supported_types/
   |-- $VENDOR_TYPE
   |   |-- available_instances
   |   |-- create
   |   |-- description
   |   |-- device_api
   |   |-- devices
   |   `-- name

ioctls for vfio devices which only provide information include:

   VFIO_DEVICE_GET_INFO
   VFIO_DEVICE_GET_REGION_INFO
   VFIO_DEVICE_GET_IRQ_INFO
   VFIO_DEVICE_GET_PCI_HOT_RESET_INFO
   VFIO_DEVICE_QUERY_GFX_PLANE

We don't need to support all of these initially, but here's a starting
idea for what this may look like in sysfs:

   $VENDOR_TYPE/
   |-- available_instances
   |-- create
   |-- description
   |-- device_api
   |-- devices
   |-- name
   `-- vfio-pci
       `-- device
           |-- gfx_plane
           |   |-- dmabuf
           |   `-- region
           |-- irqs
           |   |-- 0
           |   |   |-- count
           |   |   `-- flags
           |   `-- 1
           |       |-- count
           |       `-- flags
           `-- regions
               |-- 0
               |   |-- flags
               |   |-- offset
               |   `-- size
               `-- 3
                   |-- flags
                   |-- offset
                   `-- size

The existing device_api file reports "vfio-pci", so we base the device
API info in a directory named vfio-pci. We're specifically exposing
device information, so we have a device directory. We have a GFX_PLANE
query ioctl, so we have a gfx_plane sub-directory. I imagine the
dmabuf and region files here expose either Y/N or 1/0.

I continue on the example with how we might expose irqs and regions,
but even with regions we can bury down into how is sparse mmap exposed,
how are device specific regions described, etc. Filling this in to
completion without a specific userspace need to expose the information
is just an exercise in bloating the kernel.

That almost begins to look reasonable, but then we can only expose this
for mdev devices. What if we were to hack a back door into a directly
assigned GPU that tracks the location of active display in the
framebuffer and implement the GFX_PLANE interface for that? We have no
sysfs representation for either the template or the actual device for
anything other than mdev. This inconsistency with physically assigned
devices has been one of my arguments against enhancing mdev sysfs.

Thanks to anyone still reading this.
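
To make the gfx_plane part of the proposal concrete, a management tool
could consume it roughly as sketched below. This is purely
hypothetical: the vfio-pci/device/gfx_plane layout is only the starting
idea described above, not an existing kernel interface, and the helper
name read_gfx_plane_caps is made up for illustration. The sketch
accepts either Y/N or 1/0 and treats a missing gfx_plane directory as
"unknown" rather than "unsupported", so that older kernels can be told
apart from devices without a display.

```python
import os

_TRUE = {"y", "1"}
_FALSE = {"n", "0"}


def read_gfx_plane_caps(type_dir):
    """Read the proposed gfx_plane attributes of one mdev type directory.

    Returns e.g. {"dmabuf": True, "region": False}, or None when the
    proposed gfx_plane directory does not exist.
    """
    # The API directory is named after the contents of device_api.
    with open(os.path.join(type_dir, "device_api")) as f:
        device_api = f.read().strip()

    plane_dir = os.path.join(type_dir, device_api, "device", "gfx_plane")
    if not os.path.isdir(plane_dir):
        return None

    caps = {}
    for name in ("dmabuf", "region"):
        try:
            with open(os.path.join(plane_dir, name)) as f:
                value = f.read().strip().lower()
        except FileNotFoundError:
            # A missing file is read as "not supported" in this sketch.
            caps[name] = False
            continue
        if value in _TRUE:
            caps[name] = True
        elif value in _FALSE:
            caps[name] = False
        else:
            raise ValueError(f"unexpected value {value!r} in {name}")
    return caps
```

A caller would pass one entry of mdev_supported_types/, and could then
decide on gl support or a region display without opening the vfio
device at all.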
Ideas how we might help libvirt fill this information void so that they
can actually configure a VM with a display device? Thanks,

Alex