Re: [PATCH v4 0/6] VFIO mdev aggregated resources handling

Halil Pasic <pasic@xxxxxxxxxxxxx> · Tue, 18 Dec 2018 19:05:20 +0100

On Fri, 14 Dec 2018 11:28:36 -0700
Alex Williamson <alex.williamson@xxxxxxxxxx> wrote:

[..] 

> > 
> > In my opinion, bringing in this assumption that 'the parent device
> > provides <instances> homogeneous and functionally interchangeable
> > resources, and when one does a successful create one is allocated and
> > assigned to the created mdev device' was a needless limitation in the
> > first place. I.e. it does not translate at all to vfio-ap.
> 
> Instances is meant to convey limitation, not impose them.  Most mdev
> devices are fundamentally some sort of partitioning of a physical
> device, though it is possible to have purely virtual devices.  In the
> former case, various types can compose the resources in different ways
> and it's useful to have a mechanism to expose to the user, in an
> abstract way, how many of each type of device can be supported by the
> parent device in order to understand the constraints.  For example a
> GPU may only support a single "heavy weight" vGPU or (ex.) four
> "lightweight" vGPUs and a user would want to know this for their
> capacity planning.

I understand, available_instances is a very abstract way to convey
resource demand and supply for a type. AFAIU it ain't as much about
programming against, but rather about being informative. What I mean,
checking available_instances before doing a create is probably not very
helpful, as there is no guarantee that some other create won't snatch
away the 'available_instances', or the other way around.

> 
> There are corner cases with available_instances though, for instance a
> virtual device could have no fixed limit.  Perhaps reporting -1 for
> available instances would be a solution there.  For vfio-ap, it does
> seem that there's a limit based on the hardware, but I'll grant that
> it's somewhat difficult to calculate how and when to decrement
> available_instances based on how many intersections are created in the
> matrix.  Still, I think the usefulness for the general case outweighs
> the awkwardness for others but maybe we can re-evaluate if we should
> formally define something like -1 if it's not applicable.
> 

My intuition was, and still is, that the partitioned resources (and their
supply and demand) are better modeled by the vendor driver, at the
granularity level the vendor driver sees appropriate.

I do understand the benefit of having a common abstraction. In this case
however, I'm more on the side that this abstraction is costly.

> > Thinking abstract OOP, mdev devices are just objects of some type (or
> > with
> > other words object class). Create corresponds to the constructor.
> > Furthermore mdev is a proponent of the RAII (resource acquisition is
> > initialization) pattern -- I like RAII. Now the fact that type of the
> > object created  determines every relevant aspect of the resources
> > acquired is IMHO limiting. This series tries to mitigate that
> > problem by introduces aggregate=N with the semantic 'Nay, don't
> > acquire resources as the type would normally imply, but multiply
> > what is normal with N -- kind of.' This is an optional mechanism,
> > with optional facilities.
> > 
> > The bulk of the code is added to core mdev, because aggregate is
> > although optional but nevertheless core feature of mdev.
> > 
> > Clearly aggregate makes no sense for vfio-ccw nor for vfio-ap: for
> > the latter, the identity of the backing resources is crucial. But
> > vfio-ap could, at least theoretically, benefit from a create
> > parameter. Currently we acquire resources after the mdev device was
> > created, because in the context of create there was no means to tell
> > what resources are to be acquired.
> > 
> > Coming from this position, I think, passing everything after the uuid
> > to the vendor driver, and letting it handle 'constructor parameters'
> > would be a more flexible option. If that flexibility is desirable,
> > that is a different matter.
> 
> Sure, there's always then a question of how we maintain consistency
> within the interface and how we re-use constructor parameters between
> parent devices.  Intel has plans for this aggregation beyond the use
> case here in i915, so I think the goal is to standardize it as a
> general mdev feature.  Having the mdev core handle this option provides
> that, but I don't think it precludes that we could pass further
> parameters to the vendor driver.  Introspection is important though, so
> having a standard mechanism for vendor drivers to expose those
> parameters would be useful.  Here the aggregation feature is self
> describing, so we'd need to figure out how to do that generically and
> how to manage the namespace to reserve options for core handling and
> avoid collisions.
> 

I agree. In fact, the namespacing issues are one reason why I argue against
the vfio-mdev global 'aggregate' constructor parameter. If we allow vfio-mdev
global 'constructor parameters' then we have to worry about namespacing.

If however we do what I've proposed, i.e. pass everything after the UUID to
the vendor driver, then it is clear: vendor's and vendor drivers manage their
namespace themselves. IMHO the core does not and will not need more that the
UUID.

I agree about the introspection as well. 

> > IMHO if mdev is supposed to be about this homogeneous and at
> > acquisition time interchangeable 'instance resources' use case, and
> > everything different is considered oddball, then the aggregate
> > 'attribute' and the mechanic behind it is kind of sound. Maybe if we
> > change the description for available_instances, the conceptual
> > discomfort I voiced in the digression can be entirely eliminated.
> 
> Maybe inconsistency in documentation or interpretation, but I'm not
> sure there's a fundamental conceptual issue there.  I don't know that
> I want to call everything else 'oddball' but maybe we do need to
> consider more broadly the idea of vendor specific constructor
> parameters.  We can't have these be implicit knowledge and we may need
> to consider what it means to the interface overall if some of them are
> required parameters.  So can we use aggregation as an initial mdev
> core/global reserved parameter and perhaps use vfio-ap options as
> example vendor specific options to help define what the interface
> should look like?
> 
> > On the other hand, if we want 'a slice of some device, where what
> > exactly is included the slice actually matters' as a first class
> > citizen of the mdev universe, then I have a bad feeling about the
> > interface change series is proposing.
> 
> When originally defining mdev types, I think we said that a type should
> be a software compatible definition from the guest driver perspective.
> If we look at the PCI space, it's sort of equivalent to a vendor:device
> ID, but an abstraction to avoid PCI specific definitions.  Still, I
> think the equivalence holds that an mdev type doesn't necessarily
> define every aspect of the resource consumption, it can instead define
> a compatibility with a vendor model which might probe specific
> resources at a lower level.
> 

I did not understand this paragraph completely, but I think I get the
gist of it.

What I wanted to say with my paragraph is, that once resources (like queues)
have persistent state, things are a bit different.

> > Thanks again for prompting me to speak up. I hope I managed to
> > formulate this as reluctant, as reluctant I feel speaking about the
> > design decision being made here.
> 
> Thanks for the feedback.  As we exchanged with vfio-ap, I do like the
> resource commitment at instantiation model.  I don't know if it's too
> late for vfio-ap, constructor parameters would at least solve the
> ability to specify the elements of the matrix consumed at
> instantiation.  Thanks,
>

They would IMHO. Regarding too late or not, I have to relay that question
to Tony (cc: Tony).

Regards,
Halil