RE: [PATCH v2 2/2] drm/i915/gvt: mdev aggregation type

"Tian, Kevin" <kevin.tian@xxxxxxxxx> · Fri, 27 Mar 2020 09:31:35 +0000

> From: Zhenyu Wang <zhenyuw@xxxxxxxxxxxxxxx>
> Sent: Friday, March 27, 2020 4:58 PM
> 
> On 2020.03.27 08:44:59 +0000, Tian, Kevin wrote:
> > > From: Zhenyu Wang <zhenyuw@xxxxxxxxxxxxxxx>
> > > Sent: Friday, March 27, 2020 4:12 PM
> > >
> > [...]
> > > > > +int intel_vgpu_adjust_resource_count(struct intel_vgpu *vgpu)
> > > > > +{
> > > > > +	if ((vgpu_aperture_sz(vgpu) != vgpu->param.low_gm_sz *
> > > > > +	     vgpu->param.aggregation) ||
> > > > > +	    (vgpu_hidden_sz(vgpu) != vgpu->param.high_gm_sz *
> > > > > +	     vgpu->param.aggregation)) {
> > > > > +		/* handle aggregation change */
> > > > > +		intel_vgpu_free_resource_count(vgpu);
> > > > > +		intel_vgpu_alloc_resource_count(vgpu);
> > > >
> > > > this logic sounds like different from the commit msg. Earlier you
> > > > said the resource is not allocated until mdev open, while the
> > > > aggregated_interfaces is only allowed to be written before
> > > > mdev open. In such case, why would it required to handle the
> > > > case where a vgpu already has resource allocated then coming
> > > > a new request to adjust the number of instances?
> > >
> > > This is to handle resource accounting before mdev open by aggregation
> > > setting change. When vgpu create from mdev type, default resource
> > > count according to type is set for vgpu. So this function is just to
> > > change resource count by aggregation.
> >
> > then better change the name, e.g. .xxx_adjust_resource_accounting,
> > otherwise it's easy to be confused.
> >
> 
> ok
> 
> > [...]
> > > > >  	if (ret)
> > > > >  		goto out_clean_vgpu_mmio;
> > > > >
> > > > > -	populate_pvinfo_page(vgpu);
> > > > > +	if (!delay_res_alloc) {
> > > > > +		ret = intel_vgpu_res_alloc(vgpu);
> > > > > +		if (ret)
> > > > > +			goto out_clean_vgpu_mmio;
> > > > > +	}
> > > >
> > > > If delayed resource allocation works correctly, why do we still
> > > > need support non-delayed flavor? Even a type doesn't support
> > > > aggregation, it doesn't hurt to do allocation until mdev open...
> > > >
> > >
> > > The difference is user expectation I think, if there's really
> > > awareness of this. As original way is to allocate at creat time, so
> > > once created success, resource is guaranteed. But for aggregation type
> > > which could be changed before open, alloc happens at that time which
> > > may have different scenario, e.g might fail caused by other instance
> > > or host. So original idea is to keep old behavior but only change for
> > > aggregation type.
> >
> > but how could one expect any difference between instant allocation
> > and delayed allocation? You already update resource accounting so
> > the remaining instances are accurate anyway. Then the user only knows
> > how the vgpu looks like when it is opened...
> >
> > >
> > > If we think this user expectation is not important, delayed alloc
> > > could help to create vgpu quickly but may have more chance to fail
> > > later..
> > >
> >
> > why? If delayed allocation has more chance to fail, it means our
> > resource accounting has problem. Even for type w/o aggregation
> > capability, we should reserve one instance resource by default anyway
> >
> 
> If under really heavy load of host and many other vgpu running, we
> might not have left continual gfx mem space..This is not new problem,
> just that now we handle it at vgpu create time to reserve the
> resource. Once host side could promise some limit, then our usage
> will be guaranteed.
> 

heavy load doesn't mean that you may have higher possibility of
securing resource at a earlier point. It's completely nondeterministic
when the situation is worse or better. In such case I don't think 
there is subtle difference between instant and delayed allocation,
while unifying on delayed allocation could simplify our code. 😊

Thanks
Kevin