Hi Jason,

> From: Jason Wang <jasowang@xxxxxxxxxx>
> Sent: Thursday, December 5, 2019 12:41 AM
>
>
> On 2019/12/5 2:06 PM, Zhenyu Wang wrote:
> > On 2019.12.04 17:36:12 +0000, Parav Pandit wrote:
> >> + Jiri + Netdev since you mentioned netdev queue.
> >>
> >> + Jason Wang and Michael as we had similar discussion in vdpa discussion thread.
> >>
> >>> From: Zhenyu Wang <zhenyuw@xxxxxxxxxxxxxxx>
> >>> Sent: Friday, November 8, 2019 2:19 AM
> >>> To: Parav Pandit <parav@xxxxxxxxxxxx>
> >>>
> >> My apologies to reply late.
> >> Something bad with my email client, due to which I found this patch under spam folder today.
> >> More comments below.
> >>
> >>> On 2019.11.07 20:37:49 +0000, Parav Pandit wrote:
> >>>> Hi,
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: kvm-owner@xxxxxxxxxxxxxxx <kvm-owner@xxxxxxxxxxxxxxx> On Behalf Of Zhenyu Wang
> >>>>> Sent: Thursday, October 24, 2019 12:08 AM
> >>>>> To: kvm@xxxxxxxxxxxxxxx
> >>>>> Cc: alex.williamson@xxxxxxxxxx; kwankhede@xxxxxxxxxx; kevin.tian@xxxxxxxxx; cohuck@xxxxxxxxxx
> >>>>> Subject: [PATCH 0/6] VFIO mdev aggregated resources handling
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> This is a refresh for previous send of this series. I got
> >>>>> impression that some SIOV drivers would still deploy their own
> >>>>> create and config method so stopped effort on this. But seems this
> >>>>> would still be useful for some other SIOV driver which may simply
> >>>>> want capability to aggregate resources. So here's refreshed series.
> >>>>>
> >>>>> Current mdev device create interface depends on fixed mdev type,
> >>>>> which get uuid from user to create instance of mdev device. If
> >>>>> user wants to use customized number of resource for mdev device,
> >>>>> then only can create new
> >>>> Can you please give an example of 'resource'?
> >>>> When I grep [1], [2] and [3], I couldn't find anything related to 'aggregate'.
> >>> The resource is vendor device specific, in SIOV spec there's ADI
> >>> (Assignable Device Interface) definition which could be e.g queue
> >>> for net device, context for gpu, etc. I just named this interface as 'aggregate'
> >>> for aggregation purpose, it's not used in spec doc.
> >>>
> >> Some 'unknown/undefined' vendor specific resource just doesn't work.
> >> Orchestration tool doesn't know which resource and what/how to configure for which vendor.
> >> It has to be well defined.
> >>
> >> You can also find such discussion in recent lgpu DRM cgroup patches series v4.
> >>
> >> Exposing networking resource configuration in non-net namespace aware mdev sysfs at PCI device level is no-go.
> >> Adding per file NET_ADMIN or other checks is not the approach we follow in kernel.
> >>
> >> devlink has been a subsystem though under net, that has very rich interface for syscaller, device health, resource management and many more.
> >> Even though it is used by net driver today, its written for generic device management at bus/device level.
> >>
> >> Yuval has posted patches to manage PCI sub-devices [1] and updated version will be posted soon which addresses comments.
> >>
> >> For any device slice resource management of mdev, sub-function etc, we should be using single kernel interface as devlink [2], [3].
> >>
> >> [1] https://lore.kernel.org/netdev/1573229926-30040-1-git-send-email-yuvalav@xxxxxxxxxxxx/
> >> [2] http://man7.org/linux/man-pages/man8/devlink-dev.8.html
> >> [3] http://man7.org/linux/man-pages/man8/devlink-resource.8.html
> >>
> >> Most modern device configuration that I am aware of is usually done via well defined ioctl() of the subsystem (vhost, virtio, vfio, rdma, nvme and more) or via netlink commands (net, devlink, rdma and more) not via sysfs.
> >>
> > Current vfio/mdev configuration is via documented sysfs ABI instead of
> > other ways.
> > So this adhere to that way to introduce more configurable
> > method on mdev device for standard, it's optional and not actually
> > vendor specific e.g vfio-ap.
> >
> > I'm not sure how many devices support devlink now, or if really make
> > sense to utilize devlink for other devices except net, or if really
> > make sense to take mdev resource configuration from there...
>
>
> It may make sense to allow other types of API to manage mdev other than sysfs.
> But I'm not sure whether or not it will be a challenge for orchestration.
>

There are two parts.
1. How you specify the resource config (sysfs/netlink/devlink/ioctl, etc.).
2. The definition of the resource itself.
It has to be well defined, or it should be categorized as miscellaneous.
It cannot be some undefined, vague name such as 'aggregate'.

> Thanks
>
>
> >>>>> mdev type for that which may not be flexible. This requirement
> >>>>> comes not only from to be able to allocate flexible resources for
> >>>>> KVMGT, but also from Intel scalable IO virtualization which would
> >>>>> use vfio/mdev to be able to allocate arbitrary resources on mdev instance.
> >>> More info on [1] [2] [3].
> >>>>> To allow to create user defined resources for mdev, it trys to
> >>>>> extend mdev create interface by adding new "aggregate=xxx"
> >>>>> parameter following UUID, for target mdev type if aggregation is
> >>>>> supported, it can create new mdev device which contains resources
> >>>>> combined by number of instances, e.g
> >>>>>
> >>>>>     echo "<uuid>,aggregate=10" > create
> >>>>>
> >>>>> VM manager e.g libvirt can check mdev type with "aggregation"
> >>>>> attribute which can support this setting. If no "aggregation"
> >>>>> attribute found for mdev type, previous behavior is still kept for
> >>>>> one instance allocation. And new sysfs attribute
> >>>>> "aggregated_instances" is created for each mdev device to show
> >>>>> allocated
> >>> number.
> >>>>> References:
> >>>>> [1] https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification
> >>>>> [2] https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification
> >>>>> [3] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf
> >>>>>
> >>>>> Zhenyu Wang (6):
> >>>>>   vfio/mdev: Add new "aggregate" parameter for mdev create
> >>>>>   vfio/mdev: Add "aggregation" attribute for supported mdev type
> >>>>>   vfio/mdev: Add "aggregated_instances" attribute for supported mdev
> >>>>>     device
> >>>>>   Documentation/driver-api/vfio-mediated-device.rst: Update for
> >>>>>     vfio/mdev aggregation support
> >>>>>   Documentation/ABI/testing/sysfs-bus-vfio-mdev: Update for vfio/mdev
> >>>>>     aggregation support
> >>>>>   drm/i915/gvt: Add new type with aggregation support
> >>>>>
> >>>>>  Documentation/ABI/testing/sysfs-bus-vfio-mdev | 24 ++++++
> >>>>>  .../driver-api/vfio-mediated-device.rst       | 23 ++++++
> >>>>>  drivers/gpu/drm/i915/gvt/gvt.c                |  4 +-
> >>>>>  drivers/gpu/drm/i915/gvt/gvt.h                | 11 ++-
> >>>>>  drivers/gpu/drm/i915/gvt/kvmgt.c              | 53 ++++++++++++-
> >>>>>  drivers/gpu/drm/i915/gvt/vgpu.c               | 56 ++++++++++++-
> >>>>>  drivers/vfio/mdev/mdev_core.c                 | 36 ++++++++-
> >>>>>  drivers/vfio/mdev/mdev_private.h              |  6 +-
> >>>>>  drivers/vfio/mdev/mdev_sysfs.c                | 79 ++++++++++++++++++-
> >>>>>  include/linux/mdev.h                          | 19 +++++
> >>>>>  10 files changed, 294 insertions(+), 17 deletions(-)
> >>>>>
> >>>>> --
> >>>>> 2.24.0.rc0
> >>> --
> >>> Open Source Technology Center, Intel ltd.
> >>>
> >>> $gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827
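
For readers following the quoted cover letter, the proposed create flow (check the mdev type for an "aggregation" attribute, then write "<uuid>,aggregate=N" to its create file) could be sketched in userspace roughly as below. This is only an illustration under assumptions: the "aggregation" attribute and the "aggregate=" parameter come from the proposed series, not from an established ABI, and the helper names build_create_request and create_mdev, as well as the choice to fall back silently to a single-instance request, are hypothetical.

```python
import os


def build_create_request(dev_uuid, aggregate=None):
    """Return the string to write to 'create'.

    Plain '<uuid>' for a single instance, or '<uuid>,aggregate=N'
    as proposed in the quoted series.
    """
    if aggregate is None:
        return dev_uuid
    if aggregate < 1:
        raise ValueError("aggregate must be a positive integer")
    return f"{dev_uuid},aggregate={aggregate}"


def create_mdev(type_dir, dev_uuid, aggregate=None):
    """Write a create request under an mdev type directory.

    type_dir is the directory of one supported mdev type (the one
    holding the 'create' file). If the type has no 'aggregation'
    attribute, the aggregate count is dropped and a plain
    single-instance request is written instead (an assumption of
    this sketch, chosen to mirror the "previous behavior is kept"
    wording). Returns the string that was written.
    """
    has_aggregation = os.path.exists(os.path.join(type_dir, "aggregation"))
    if aggregate is not None and not has_aggregation:
        aggregate = None
    request = build_create_request(dev_uuid, aggregate)
    with open(os.path.join(type_dir, "create"), "w") as f:
        f.write(request)
    return request
```

On a real system type_dir would be one of the mdev_supported_types/<type-id> directories of the parent device; the sketch itself only manipulates paths and files, so it can be exercised against a scratch directory.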