On 09/01/2016 12:59 PM, Alex Williamson wrote:
On Thu, 1 Sep 2016 18:47:06 +0200 Michal Privoznik <mprivozn@xxxxxxxxxx> wrote:On 31.08.2016 08:12, Tian, Kevin wrote:From: Alex Williamson [mailto:alex.williamson@xxxxxxxxxx] Sent: Wednesday, August 31, 2016 12:17 AM Hi folks, At KVM Forum we had a BoF session primarily around the mediated device sysfs interface. I'd like to share what I think we agreed on and the "problem areas" that still need some work so we can get the thoughts and ideas from those who weren't able to attend. DanPB expressed some concern about the mdev_supported_types sysfs interface, which exposes a flat csv file with fields like "type", "number of instance", "vendor string", and then a bunch of type specific fields like "framebuffer size", "resolution", "frame rate limit", etc. This is not entirely machine parsing friendly and sort of abuses the sysfs concept of one value per file. Example output taken from Neo's libvirt RFC: cat /sys/bus/pci/devices/0000:86:00.0/mdev_supported_types # vgpu_type_id, vgpu_type, max_instance, num_heads, frl_config, framebuffer, max_resolution 11 ,"GRID M60-0B", 16, 2, 45, 512M, 2560x1600 12 ,"GRID M60-0Q", 16, 2, 60, 512M, 2560x1600 13 ,"GRID M60-1B", 8, 2, 45, 1024M, 2560x1600 14 ,"GRID M60-1Q", 8, 2, 60, 1024M, 2560x1600 15 ,"GRID M60-2B", 4, 2, 45, 2048M, 2560x1600 16 ,"GRID M60-2Q", 4, 4, 60, 2048M, 2560x1600 17 ,"GRID M60-4Q", 2, 4, 60, 4096M, 3840x2160 18 ,"GRID M60-8Q", 1, 4, 60, 8192M, 3840x2160 The create/destroy then looks like this: echo "$mdev_UUID:vendor_specific_argument_list" > /sys/bus/pci/devices/.../mdev_create echo "$mdev_UUID:vendor_specific_argument_list" > /sys/bus/pci/devices/.../mdev_destroy "vendor_specific_argument_list" is nebulous. So the idea to fix this is to explode this into a directory structure, something like: ├── mdev_destroy └── mdev_supported_types ├── 11 │ ├── create │ ├── description │ └── max_instances ├── 12 │ ├── create │ ├── description │ └── max_instances └── 13 ├── create ├── description └── max_instances Note that I'm only exposing the minimal attributes here for simplicity, the other attributes would be included in separate files and we would require vendors to create standard attributes for common device classes.I like this idea. All standard attributes are reflected into this hierarchy. In the meantime, can we still allow optional vendor string in create interface? libvirt doesn't need to know the meaning, but allows upper layer to do some vendor specific tweak if necessary.This is not the best idea IMO. Libvirt is there to shadow differences between hypervisors. While doing that, we often hide differences between various types of HW too. Therefore in order to provide good abstraction we should make vendor specific string as small as possible (ideally an empty string). I mean I see it as bad idea to expose "vgpu_type_id" from example above in domain XML. What I think the better idea is if we let users chose resolution and frame buffer size, e.g.: <video resolution="1024x768" framebuffer="16"/> (just the first idea that came to my mind while writing this e-mail). The point is, XML part is completely free of any vendor-specific knobs.That's not really what you want though, a user actually cares whether they get an Intel of NVIDIA vGPU, we can't specify it as just a resolution and framebuffer size. The user also doesn't want the model changing each time the VM is started, so not only do you *need* to know the vendor, you need to know the vendor model
as well as any other configuration that might change over time. A similar issue - libvirt really doesn't know or care what a "chassis" is in an ioh3420 (a PCIe root-port), but it's a guest-visible property of the device that qemu can set (and could presumably decide to change the default setting of some time in the future), so libvirt has to set a value for it in the config, and specify it on the qemu commandline.
What I'm getting at is that if there is anything in the vendor-specific string that changes guest ABI, and that could change over time, then libvirt can't just rely on it remaining the same, it needs to have it saved in the config for later reproduction, even if it doesn't understand the contents.
(for that matter, you may want to consider some type of "versioned vGPU type" similar to qemu's versions machinetypes (e.g. "pc-i440fx-2.6", which has some sort of incompatible ABI differences from "pc-i440fx-1.4"), where any guest-ABI-changing modifications to the vGPU would take effect only when the appropriate version of device was requested. That way a guest originally created to use today's version of vGPU X in resolution Y would continue to work even if incompatible guest ABI changes were made in the future.)
-- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html