On Wed, 12 Dec 2018 18:10:28 +0100 Halil Pasic <pasic@xxxxxxxxxxxxx> wrote: > On Tue, 11 Dec 2018 11:10:13 -0700 > Alex Williamson <alex.williamson@xxxxxxxxxx> wrote: > > > [cc +Halil] > > > > On Mon, 26 Nov 2018 16:48:50 +0800 > > Zhenyu Wang <zhenyuw@xxxxxxxxxxxxxxx> wrote: > > > > > Hi, > > > > > > This is respin of previous sending from > > > https://www.spinics.net/lists/kvm/msg176447.html > > > > > > Current mdev device create interface depends on fixed mdev type, > > > which get uuid from user to create instance of mdev device. If user > > > wants to use customized number of resource for mdev device, then > > > only can create new mdev type for that which may not be flexible. > > > This requirement comes not only from to be able to allocate flexible > > > resources for KVMGT, but also from Intel scalable IO virtualization > > > which would use vfio/mdev to be able to allocate arbitrary resources > > > on mdev instance. More info on [1] [2] [3]. > > > > > > To allow to create user defined resources for mdev, it trys to > > > extend mdev create interface by adding new "aggregate=xxx" parameter > > > following UUID, for target mdev type if aggregation is supported, it > > > can create new mdev device which contains resources combined by > > > number of instances, e.g > > > > > > echo "<uuid>,aggregate=10" > create > > > > > > VM manager e.g libvirt can check mdev type with "aggregation" > > > attribute which can support this setting. If no "aggregation" > > > attribute found for mdev type, previous behavior is still kept for > > > one instance allocation. And new sysfs attribute > > > "aggregated_instances" is created for each mdev device to show > > > allocated number. > > > > Halil, I hope I'm not mis-remembering, but I think you mentioned some > > apprehension of this aggregation attribute with regard to the spirit of > > mdev devices at KVM Forum. If that's correct, do you have any comments > > on this series or suggestions towards a better interface? Obviously > > this is an optional attribute targeting devices which can scale with a > > semi-arbitrary number of instances behind a single device, where > > specifying an mdev type for each enumeration of those aggregate > > instances in impractical. I don't know that there's anything similar > > in AP, but if we're missing some alignment between interfaces, please > > speak up. Thanks, > > > > Alex > > > > Hi Alex! > > Thanks for asking. First things first: I'm happy we can squeeze in with > our weird scenarios (like vfio-ap to some extent vfio-ccw) somehow, and > I'm a bit reluctant to criticize or make to generalizations coming > form those positions. > > My personal opinion is, with the mandatory attribute available_instances > we got the assumption of homogeneous and functionally interchangeable > resources represented by an mdev device of a certain mdev type into the > DNA of mdev. > > Let me quote from Documentation/vfio-mediated-device.txt > > """ > * available_instances > > This attribute should show the number of devices of type <type-id> > that can be created. > """ > <digression> > Please note it says *devices* (does not talk backing resources or > whatever). Yes, but also note that instances is an implicitly shared concept between all types. We have examples where creation of one type of device disables other types by reporting zero instances_available (ie. homogeneous type instantiation per parent device) and also device types of different weights, where an instance from one type might consume multiple instances of another type. > This series however describes "aggregation" like > > """ > * <aggregation> > > <aggregation> is an optional attributes to show max number that the > instance resources of [<type-id>] can be aggregated to be assigned > for one mdev device. No <aggregation> attribute means driver doesn't > support to aggregate instance resoures for one mdev device. > <aggregation> may be less than available_instances which depends on > driver. <aggregation> number can't exceed available_instances. > > Set number of instances by appending "aggregate=N" parameter for > create attribute. By default without "aggregate=N" parameter it > will create one instance as normal. > """ > > Please note that we talk about *instance resources* where instance is > not an mdev device instance (when aggregating we have one mdev device > instance). > </digression> > > In my opinion, bringing in this assumption that 'the parent device > provides <instances> homogeneous and functionally interchangeable > resources, and when one does a successful create one is allocated and > assigned to the created mdev device' was a needless limitation in the > first place. I.e. it does not translate at all to vfio-ap. Instances is meant to convey limitation, not impose them. Most mdev devices are fundamentally some sort of partitioning of a physical device, though it is possible to have purely virtual devices. In the former case, various types can compose the resources in different ways and it's useful to have a mechanism to expose to the user, in an abstract way, how many of each type of device can be supported by the parent device in order to understand the constraints. For example a GPU may only support a single "heavy weight" vGPU or (ex.) four "lightweight" vGPUs and a user would want to know this for their capacity planning. There are corner cases with available_instances though, for instance a virtual device could have no fixed limit. Perhaps reporting -1 for available instances would be a solution there. For vfio-ap, it does seem that there's a limit based on the hardware, but I'll grant that it's somewhat difficult to calculate how and when to decrement available_instances based on how many intersections are created in the matrix. Still, I think the usefulness for the general case outweighs the awkwardness for others but maybe we can re-evaluate if we should formally define something like -1 if it's not applicable. > Thinking abstract OOP, mdev devices are just objects of some type (or > with > other words object class). Create corresponds to the constructor. > Furthermore mdev is a proponent of the RAII (resource acquisition is > initialization) pattern -- I like RAII. Now the fact that type of the > object created determines every relevant aspect of the resources > acquired is IMHO limiting. This series tries to mitigate that problem by > introduces aggregate=N with the semantic 'Nay, don't acquire resources > as the type would normally imply, but multiply what is normal with N -- > kind of.' This is an optional mechanism, with optional facilities. > > The bulk of the code is added to core mdev, because aggregate is although > optional but nevertheless core feature of mdev. > > Clearly aggregate makes no sense for vfio-ccw nor for vfio-ap: for the > latter, the identity of the backing resources is crucial. But vfio-ap > could, at least theoretically, benefit from a create parameter. Currently > we acquire resources after the mdev device was created, because in the > context of create there was no means to tell what resources are to be > acquired. > > Coming from this position, I think, passing everything after the uuid > to the vendor driver, and letting it handle 'constructor parameters' > would be a more flexible option. If that flexibility is desirable, that > is a different matter. Sure, there's always then a question of how we maintain consistency within the interface and how we re-use constructor parameters between parent devices. Intel has plans for this aggregation beyond the use case here in i915, so I think the goal is to standardize it as a general mdev feature. Having the mdev core handle this option provides that, but I don't think it precludes that we could pass further parameters to the vendor driver. Introspection is important though, so having a standard mechanism for vendor drivers to expose those parameters would be useful. Here the aggregation feature is self describing, so we'd need to figure out how to do that generically and how to manage the namespace to reserve options for core handling and avoid collisions. > IMHO if mdev is supposed to be about this homogeneous and at acquisition > time interchangeable 'instance resources' use case, and everything > different is considered oddball, then the aggregate 'attribute' and the > mechanic behind it is kind of sound. Maybe if we change the description > for available_instances, the conceptual discomfort I voiced in the > digression can be entirely eliminated. Maybe inconsistency in documentation or interpretation, but I'm not sure there's a fundamental conceptual issue there. I don't know that I want to call everything else 'oddball' but maybe we do need to consider more broadly the idea of vendor specific constructor parameters. We can't have these be implicit knowledge and we may need to consider what it means to the interface overall if some of them are required parameters. So can we use aggregation as an initial mdev core/global reserved parameter and perhaps use vfio-ap options as example vendor specific options to help define what the interface should look like? > On the other hand, if we want 'a slice of some device, where what exactly > is included the slice actually matters' as a first class citizen of the > mdev universe, then I have a bad feeling about the interface change > series is proposing. When originally defining mdev types, I think we said that a type should be a software compatible definition from the guest driver perspective. If we look at the PCI space, it's sort of equivalent to a vendor:device ID, but an abstraction to avoid PCI specific definitions. Still, I think the equivalence holds that an mdev type doesn't necessarily define every aspect of the resource consumption, it can instead define a compatibility with a vendor model which might probe specific resources at a lower level. > Thanks again for prompting me to speak up. I hope I managed to formulate > this as reluctant, as reluctant I feel speaking about the design decision > being made here. Thanks for the feedback. As we exchanged with vfio-ap, I do like the resource commitment at instantiation model. I don't know if it's too late for vfio-ap, constructor parameters would at least solve the ability to specify the elements of the matrix consumed at instantiation. Thanks, Alex