On 9/3/2016 6:37 PM, Paolo Bonzini wrote:
>
>
> On 03/09/2016 13:56, John Ferlan wrote:
>> On 09/02/2016 05:48 PM, Paolo Bonzini wrote:
>>> On 02/09/2016 20:33, Kirti Wankhede wrote:
>>>> <Alex> We could even do:
>>>>>>
>>>>>> echo $UUID1:$GROUPA > create
>>>>>>
>>>>>> where $GROUPA is the group ID of a previously created mdev device into
>>>>>> which $UUID1 is to be created and added to the same group.
>>>> </Alex>
>>>
>>> From the point of view of libvirt, I think I prefer Alex's idea.
>>> <group> could be an additional element in the nodedev-create XML:
>>>
>>>     <device>
>>>       <name>my-vgpu</name>
>>>       <parent>pci_0000_86_00_0</parent>
>>>       <capability type='mdev'>
>>>         <type id='11'/>
>>>         <uuid>0695d332-7831-493f-9e71-1c85c8911a08</uuid>
>>>         <group>group1</group>
>>>       </capability>
>>>     </device>
>>>
>>> (should group also be a UUID?)
>>

As I replied to an earlier mail, the group number doesn't need to be a
UUID. It should be a unique number. I think in the discussion at the BoF
someone mentioned using the domain's unique number that libvirt
generates. That should also work.

>> As long as create_group handles all the work and all libvirt does is
>> call it, get the return status/error, and handle deleting the vGPU on
>> error, then I guess it's doable.
>>

Yes, that is the idea. Libvirt doesn't have to care about the groups.
With Alex's proposal, as you mentioned above, libvirt has to provide the
group number to mdev_create, check the return status and handle the
error case.

    echo $UUID1:$GROUP1 > mdev_create
    echo $UUID2:$GROUP1 > mdev_create

would create two mdev devices assigned to the same domain.

>> Alternatively having multiple <type id='#'> in the XML and performing a
>> single *mdev/create_group is an option.
>
> I don't really like the idea of a single nodedev-create creating
> multiple devices, but that would work too.
>
>> That is, what is the "output" from create_group that gets added to the
>> domain XML?  How is that found?
>
> A new sysfs path is created, whose name depends on the UUID.  The UUID
> is used in a <hostdev> element in the domain XML and the sysfs path
> appears in the QEMU command line.  Kirti and Neo had examples in their
> presentation at KVM Forum.
>
> If you create multiple devices in the same group, they are added to the
> same IOMMU group so they must be used by the same VM.  However they
> don't have to be available from the beginning; they could be
> hotplugged/hot-unplugged later, since from the point of view of the VM
> those are just another PCI device.
>
>> Also, once the domain is running can a
>> vGPU be added to the group?  Removed?  What allows/prevents?
>
> Kirti?... :)

Yes, a vGPU could be hot-plugged or hot-unplugged. This also depends on
whether the vendor driver wants to support that. For example, if a
domain is running with two vGPUs, $UUID1 and $UUID2, and the user tries
to hot-unplug vGPU $UUID2, the vendor driver knows that the domain is
running and that the vGPU is being used in the guest, so the vendor
driver can fail the offline/close() call if it doesn't support
hot-unplug. Similarly, for hot-plug the vendor driver can fail the
create call if it doesn't support hot-plug.

>
> In principle I don't think anything should block vGPUs from different
> groups being added to the same VM, but I have to defer to Alex and Kirti
> again on this.
>

No, there should be one group per VM.

>>> Since John brought up the topic of minimal XML, in this case it will be
>>> like this:
>>>
>>>     <device>
>>>       <name>my-vgpu</name>
>>>       <parent>pci_0000_86_00_0</parent>
>>>       <capability type='mdev'>
>>>         <type id='11'/>
>>>       </capability>
>>>     </device>
>>>
>>> The uuid will be autogenerated by libvirt and if there's no <group> (as
>>> is common for VMs with only 1 vGPU) it will be a single-device group.
>>
>> The <name> could be ignored as it seems existing libvirt code wants to
>> generate a name via udevGenerateDeviceName for other devices. I haven't
>> studied it long enough, but I believe that's how those pci_####* names
>> created.
>
> Yeah that makes sense.  So we get down to a minimal XML that has just
> parent, and capability with type in it; additional elements could be
> name (ignored anyway), and within capability uuid and group.
>

Yes, this seems good.
I would like to have one more capability here. Pulling in a suggestion
from my previous mail:
In the directory structure, a 'params' file can take optional
parameters. Libvirt can then set 'params' and then create the mdev
device. For example, if the param 'disable_console_vnc=1' is set for
type 11, then devices created of type 11 will have that param set unless
it is cleared.

 └── mdev_supported_types
     ├── 11
     │   ├── create
     │   ├── description
     │   ├── max_instances
     │   └── params
     ├── 12
     │   ├── create
     │   ├── description
     │   ├── max_instances
     │   └── params
     └── 13
         ├── create
         ├── description
         ├── max_instances
         └── params

So with that, the XML format would be:

    <device>
      <name>my-vgpu</name>
      <parent>pci_0000_86_00_0</parent>
      <capability type='mdev'>
        <type id='11'/>
        <group>group1</group>
        <params>disable_console_vnc=1</params>
      </capability>
    </device>

The 'params' field should be just a string to libvirt, and it is also
optional. If a user wants to provide extra parameters while creating a
vGPU device, they should provide them in the XML file passed to
nodedev-create, as above.

The very first proposal was to pass this extra parameter list as a
string to mdev_create itself:

    echo $UUID1:$PARAMS > mdev_create

I would like to know others' opinions on whether it should be part of
the mdev_create input or a separate write to the 'params' file in sysfs,
as in the directory structure above.

Kirti.

> Thanks,
>
> Paolo
>
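
To make the question about 'params' concrete, the two alternatives can
be sketched roughly as below. This is only an illustration, not a
proposed implementation: the function names and the "parent" argument
(standing in for the parent device's sysfs directory) are made up for
the example, the file names simply follow the layout discussed in this
thread, and writing just the UUID to the per-type 'create' file in the
second variant is an assumption.

```shell
#!/bin/sh
# Illustrative sketch only. "parent" is a hypothetical stand-in for the
# parent device's sysfs directory; nothing here is a settled interface.

# Alternative A: the parameter string travels with the create request,
# i.e. "echo $UUID1:$PARAMS > mdev_create".
mdev_create_with_params() {
    parent=$1
    uuid=$2
    params=$3
    printf '%s:%s\n' "$uuid" "$params" > "$parent/mdev_create"
}

# Alternative B: write the per-type 'params' file first, then create the
# device through that type's 'create' file.
# Assumption: 'create' takes just the UUID in this variant.
mdev_create_via_params_file() {
    parent=$1
    type_id=$2
    uuid=$3
    params=$4
    type_dir=$parent/mdev_supported_types/$type_id
    # Stop before creating the device if the params write is rejected.
    printf '%s\n' "$params" > "$type_dir/params" || return 1
    printf '%s\n' "$uuid" > "$type_dir/create"
}
```

With the first form the parameters apply to a single create call. With
the second they stay set on the type until cleared, so libvirt would
presumably need to write (or clear) 'params' before every create of that
type.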