On Wed, 6 Nov 2019 11:44:40 -0700
Alex Williamson <alex.williamson@xxxxxxxxxx> wrote:

> On Wed, 6 Nov 2019 12:20:31 +0800
> Zhenyu Wang <zhenyuw@xxxxxxxxxxxxxxx> wrote:
>
> > On 2019.11.05 14:10:42 -0700, Alex Williamson wrote:
> > > On Thu, 24 Oct 2019 13:08:23 +0800
> > > Zhenyu Wang <zhenyuw@xxxxxxxxxxxxxxx> wrote:
> > >
> > > > Hi,
> > > >
> > > > This is a refresh for previous send of this series. I got impression that
> > > > some SIOV drivers would still deploy their own create and config method so
> > > > stopped effort on this. But seems this would still be useful for some other
> > > > SIOV driver which may simply want capability to aggregate resources. So here's
> > > > refreshed series.
> > > >
> > > > Current mdev device create interface depends on fixed mdev type, which get uuid
> > > > from user to create instance of mdev device. If user wants to use customized
> > > > number of resource for mdev device, then only can create new mdev type for that
> > > > which may not be flexible. This requirement comes not only from to be able to
> > > > allocate flexible resources for KVMGT, but also from Intel scalable IO
> > > > virtualization which would use vfio/mdev to be able to allocate arbitrary
> > > > resources on mdev instance. More info on [1] [2] [3].
> > > >
> > > > To allow to create user defined resources for mdev, it tries to extend mdev
> > > > create interface by adding new "aggregate=xxx" parameter following UUID, for
> > > > target mdev type if aggregation is supported, it can create new mdev device
> > > > which contains resources combined by number of instances, e.g
> > > >
> > > >     echo "<uuid>,aggregate=10" > create
> > > >
> > > > VM manager e.g libvirt can check mdev type with "aggregation" attribute which
> > > > can support this setting. If no "aggregation" attribute found for mdev type,
> > > > previous behavior is still kept for one instance allocation.
> > > > And new sysfs attribute "aggregated_instances" is created for each mdev
> > > > device to show allocated number.
> > >
> > > Given discussions we've had recently around libvirt interacting with
> > > mdev, I think that libvirt would rather have an abstract interface via
> > > mdevctl[1]. Therefore can you evaluate how mdevctl would support this
> > > creation extension? It seems like it would fit within the existing
> > > mdev and mdevctl framework if aggregation were simply a sysfs attribute
> > > for the device. For example, the mdevctl steps might look like this:
> > >
> > >     mdevctl define -u UUID -p PARENT -t TYPE
> > >     mdevctl modify -u UUID --addattr=mdev/aggregation --value=2
> > >     mdevctl start -u UUID
> > >
> > > When mdevctl starts the mdev, it will first create it using the
> > > existing mechanism, then apply aggregation attribute, which can consume
> > > the necessary additional instances from the parent device, or return an
> > > error, which would unwind and return a failure code to the caller
> > > (libvirt). I think the vendor driver would then have freedom to decide
> > > when the attribute could be modified, for instance it would be entirely
> > > reasonable to return -EBUSY if the user attempts to modify the
> > > attribute while the mdev device is in-use. Effectively aggregation
> > > simply becomes a standardized attribute with common meaning. Thoughts?
> > > [cc libvirt folks for their impression] Thanks,
> >
> > I think one problem is that before mdevctl start to create mdev you
> > don't know what vendor attributes are, as we apply mdev attributes
> > after create. You may need some lookup depending on parent.. I think
> > making aggregation like other vendor attribute for mdev might be the
> > simplest way, but do we want to define its behavior in formal? e.g
> > like previous discussed it should show maximum instances for aggregation, etc.
>
> Yes, we'd still want to standardize how we enable and discover
> aggregation since we expect multiple users. Even if libvirt were to
> use mdevctl as its mdev interface, higher level tools should have an
> introspection mechanism available. Possibly the sysfs interfaces
> proposed in this series remains largely the same, but I think perhaps
> the implementation of them moves out to the vendor driver. In fact,
> perhaps the only change to mdev core is to define the standard. For
> example, the "aggregation" attribute on the type is potentially simply
> a defined, optional, per type attribute, similar to "name" and
> "description". For "aggregated_instances" we already have the
> mdev_attr_groups of the mdev_parent_ops, we could define an
> attribute_group with .name = "mdev" as a set of standardized
> attributes, such that vendors could provide both their own vendor
> specific attributes and per device attributes with a common meaning and
> semantic defined in the mdev ABI.

+1 to standardizing this. While not every vendor driver will support
aggregation, providing a common infrastructure to ensure those that do
use the same approach is a good idea.

>
> > The behavior change for driver is that previously aggregation is
> > handled at create time, but for sysfs attr it should handle any
> > resource allocation before it's really in-use. I think some SIOV
> > driver which already requires some specific config should be ok,
> > but not sure for other driver which might not be explored in this before.
> > Would that be a problem? Kevin?
>
> Right, I'm assuming the aggregation could be modified until the device
> is actually opened, the driver can nak the aggregation request by
> returning an errno to the attribute write. I'm trying to anticipate
> whether this introduces new complications, for instance races with
> contiguous allocations. I think these seem solvable within the vendor
> drivers, but please note it if I'm wrong.
> Thanks,
>
> Alex

FWIW, the ap driver does this post-creation configuration stuff already.
The intended workflow is create->add adapters/domains->start vm with
assigned device. Do we want to do some standardization as to how
post-creation configuration is supposed to work (like, at which point in
time is it fine to manipulate the attribute)? I'm not sure how much of
this is vendor-driver specific.
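
To make the timing question concrete, here is a rough Python model of
the rule Alex describes above: the aggregation attribute may be written
after create and before the device is opened, and the write fails with
an errno otherwise. This is only a sketch for discussion, not kernel
code and not an existing API. The class MdevModel, its method names,
and the use of -ENOSPC and -EINVAL are assumptions made for
illustration; only -EBUSY comes from the thread.

```python
import errno


class MdevModel:
    """Toy model of one mdev device (hypothetical, for discussion only)."""

    def __init__(self, parent_free_instances):
        if parent_free_instances < 1:
            raise ValueError("parent has no free instance to create from")
        # Creation itself consumes one instance from the parent.
        self.parent_free = parent_free_instances - 1
        self.aggregated_instances = 1
        self.in_use = False

    def open(self):
        # Stands in for userspace (e.g. a VM) opening the device.
        self.in_use = True

    def release(self):
        self.in_use = False

    def store_aggregation(self, count):
        """Mimic a sysfs attribute write: return 0 or a negative errno."""
        if self.in_use:
            return -errno.EBUSY   # device already opened, too late to change
        if count < 1:
            return -errno.EINVAL  # assumed: at least one instance is required
        extra = count - self.aggregated_instances
        if extra > self.parent_free:
            return -errno.ENOSPC  # assumed: parent cannot supply enough instances
        # Take extra instances from the parent (or give some back if shrinking).
        self.parent_free -= extra
        self.aggregated_instances = count
        return 0


dev = MdevModel(parent_free_instances=4)  # "create"
rc_configure = dev.store_aggregation(2)   # configure before use: accepted
dev.open()                                # "start vm with assigned device"
rc_late = dev.store_aggregation(3)        # rejected while the device is in use
```

In this model the window for manipulating the attribute is exactly
"created but not opened". Whether that window should be part of the
standardized semantics or left to each vendor driver is the open
question.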