> On 23 Apr 2019, at 12:39, Daniel P. Berrangé <berrange@xxxxxxxxxx> wrote: > > On Fri, Apr 19, 2019 at 04:35:04AM -0400, Yan Zhao wrote: >> device version attribute in mdev sysfs is used by user space software >> (e.g. libvirt) to query device compatibility for live migration of VFIO >> mdev devices. This attribute is mandatory if a mdev device supports live >> migration. >> >> It consists of two parts: common part and vendor proprietary part. >> common part: 32 bit. lower 16 bits is vendor id and higher 16 bits >> identifies device type. e.g., for pci device, it is >> "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16). >> vendor proprietary part: this part is varied in length. vendor driver can >> specify any string to identify a device. >> >> When reading this attribute, it should show device version string of the >> device of type <type-id>. If a device does not support live migration, it >> should return errno. >> When writing a string to this attribute, it returns errno for >> incompatibility or returns written string length in compatibility case. >> If a device does not support live migration, it always returns errno. >> >> For user space software to use: >> 1. >> Before starting live migration, user space software first reads source side >> mdev device's version. e.g. >> "#cat \ >> /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version" >> 00028086-193b-i915-GVTg_V5_4 >> >> 2. >> Then, user space software writes the source side returned version string >> to device version attribute in target side, and checks the return value. >> If a negative errno is returned in the target side, then mdev devices in >> source and target sides are not compatible; >> If a positive number is returned and it equals to the length of written >> string, then the two mdev devices in source and target side are compatible. >> e.g. >> (a) compatibility case >> "# echo 00028086-193b-i915-GVTg_V5_4 > >> /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version" >> >> (b) incompatibility case >> "#echo 00028086-193b-i915-GVTg_V5_1 > >> /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version" >> -bash: echo: write error: Invalid argument > > What you have written here seems to imply that each mdev type is able to > support many different versions at the same time. Writing a version into > this sysfs file then chooses which of the many versions to actually use. > > This is good as it allows for live migration across driver software upgrades. > > A mgmt application may well want to know what versions are supported for an > mdev type *before* starting a migration. A mgmt app can query all the 100's > of hosts it knows and thus figure out which are valid to use as the target > of a migration. > > IOW, we want to avoid the ever hitting the incompatibility case in the > first place, by only choosing to migrate to a host that we know is going > to be compatible. > > This would need some kind of way to report the full list of supported > versions against the mdev supported types on the host. > > >> 3. if two mdev devices are compatible, user space software can start >> live migration, and vice versa. >> >> Note: if a mdev device does not support live migration, it either does >> not provide a version attribute, or always returns errno when its version >> attribute is read/written. >> >> Cc: Alex Williamson <alex.williamson@xxxxxxxxxx> >> Cc: Erik Skultety <eskultet@xxxxxxxxxx> >> Cc: "Dr. David Alan Gilbert" <dgilbert@xxxxxxxxxx> >> Cc: Cornelia Huck <cohuck@xxxxxxxxxx> >> Cc: "Tian, Kevin" <kevin.tian@xxxxxxxxx> >> Cc: Zhenyu Wang <zhenyuw@xxxxxxxxxxxxxxx> >> Cc: "Wang, Zhi A" <zhi.a.wang@xxxxxxxxx> >> Cc: Neo Jia <cjia@xxxxxxxxxx> >> Cc: Kirti Wankhede <kwankhede@xxxxxxxxxx> >> >> Signed-off-by: Yan Zhao <yan.y.zhao@xxxxxxxxx> >> --- >> Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++ >> samples/vfio-mdev/mbochs.c | 17 ++++++++++++ >> samples/vfio-mdev/mdpy.c | 16 ++++++++++++ >> samples/vfio-mdev/mtty.c | 16 ++++++++++++ >> 4 files changed, 85 insertions(+) >> >> diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt >> index c3f69bcaf96e..bc28471c0667 100644 >> --- a/Documentation/vfio-mediated-device.txt >> +++ b/Documentation/vfio-mediated-device.txt >> @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device >> | | |--- available_instances >> | | |--- device_api >> | | |--- description >> + | | |--- version >> | | |--- [devices] >> | |--- [<type-id>] >> | | |--- create >> @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device >> | | |--- available_instances >> | | |--- device_api >> | | |--- description >> + | | |--- version >> | | |--- [devices] >> | |--- [<type-id>] >> | |--- create >> @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device >> | |--- available_instances >> | |--- device_api >> | |--- description >> + | |--- version >> | |--- [devices] >> >> * [mdev_supported_types] >> @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device >> [<type-id>], device_api, and available_instances are mandatory attributes >> that should be provided by vendor driver. >> >> + version is a mandatory attribute if a mdev device supports live migration. >> + >> * [<type-id>] >> >> The [<type-id>] name is created by adding the device driver string as a prefix >> @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device >> This attribute should show the number of devices of type <type-id> that can be >> created. >> >> +* version >> + >> + This attribute is rw. It is used to check whether two devices are compatible >> + for live migration. If this attribute is missing, then the corresponding mdev >> + device is regarded as not supporting live migration. >> + >> + It consists of two parts: common part and vendor proprietary part. >> + common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies >> + device type. e.g., for pci device, it is >> + "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16). >> + vendor proprietary part: this part is varied in length. vendor driver can >> + specify any string to identify a device. >> + >> + When reading this attribute, it should show device version string of the device >> + of type <type-id>. If a device does not support live migration, it should >> + return errno. >> + When writing a string to this attribute, it returns errno for incompatibility >> + or returns written string length in compatibility case. If a device does not >> + support live migration, it always returns errno. >> + >> + for example. >> + # cat \ >> + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version >> + 00028086-193b-i915-GVTg_V5_2 >> + >> + #echo 00028086-193b-i915-GVTg_V5_2 > \ >> + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version >> + -bash: echo: write error: Invalid argument >> + > > IIUC this path is against the physical device. IOW, the mgmt app would have > to first write to the "version" file to choose a version, and then write to > the "create" file to actually create an virtual device. This has the obvious > concurrency problem if multiple devices are being created at the same time > and distinct versions for each device are required. There would need to be > a locking scheme defined to ensure safety. > > Wouldn't it be better if we can pass the desired version when we write to > the "create" file, so that we avoid any concurrent usage problems. "version" > could be just a read-only file with a *list* of supported versions. > > eg > > $ cat /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version > 5.0 > 5.1 > 5.2 > > $ echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001;version=5.2" > > /sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create > I read Alex’s comment that creating an mdev with a specific version is not the intent here. However… - If the goal is just to check compatibility with migration data, then I think the name should be more explicit, e.g. migration_version instead of version. Otherwise, I read the intent the way Daniel did. - If we want to provide a list of versions and make it possible to create instances based on a version, I would rather use a directory structure for that, i.e. (replacing cat with ls): $ ls /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version 5.0 5.1 5.2 where each version is a directory that has its own available_instances, device_api, description, create, … $ echo 83b8f4f2-509f-382f-3c1e-e6bfe0fa1001 > /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version/5.1/create In addition to not having the problem of two consecutive writes to distinct files, I can imagine contrived cases where available_instances might depend on the version… Thanks Christophe > > Regards, > Daniel > -- > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o- https://fstop138.berrange.com :| > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|