> From: Sean Christopherson <seanjc@xxxxxxxxxx>
> Sent: Wednesday, January 5, 2022 2:55 AM
>
> On Tue, Jan 04, 2022, Paolo Bonzini wrote:
> > On 12/29/21 14:13, Yang Zhong wrote:
> > > We highly appreciate your review. This version mostly addressed the
> > > comments from Sean. Most comments were adopted, except three which
> > > are not yet closed and need more discussion:
> > >
> > > - Move the entire xfd write emulation code to x86.c. Doing so
> > >   requires introducing a new kvm_x86_ops callback to disable the
> > >   msr write bitmap. According to Paolo's earlier comment he prefers
> > >   to handle it in vmx.c.
> >
> > Yes, I do.
>
> No objection, my comments were from before seeing the patches that
> manipulated the bitmap, e.g. in the earlier patches, having anything
> in vmx.c was unnecessary.
>
> > > - Directly check the msr_bitmap in update_exception_bitmap() (for
> > >   trapping #NM) and vcpu_enter_guest() (for syncing guest xfd after
> > >   vm-exit) instead of introducing an extra flag in the last patch.
> > >   However, doing so requires another new kvm_x86_ops callback for
> > >   checking the msr_bitmap, since vcpu_enter_guest() is x86 common
> > >   code. Having an extra flag sounds simpler here (at least for the
> > >   initial AMX support). It does penalize a nested guest with one
> > >   xfd sync per exit, but that is no worse than a normal guest which
> > >   initializes xfd but doesn't run AMX applications at all. This
> > >   could be improved afterwards.
> >
> > The thing to do here would be to move
> > MAX_POSSIBLE_PASSTHROUGH_MSRS/MAX_DIRECT_ACCESS_MSRS from VMX/SVM to
> > core code. For now we can keep the flag.

Sounds good.

> > > - Disable the #NM trap for nested guests. This version still
> > >   chooses to always trap #NM (regardless of whether in L1 or L2) as
> > >   long as xfd write interception is disabled. In reality #NM is
> > >   rare if the nested guest doesn't intend to run AMX applications,
> > >   and always-trap is safer than dynamic trap for the basic support,
> > >   in case of any oversight here.
> >
> > Sean was justifying this with the lack of support for nested AMX,
> > but I'm not sure what is actually missing at all. That is, an L1
> > hypervisor could expose AMX to L2, and then an L2->L0->L2
> > exit/reentry would have to trap #NM. Otherwise it would miss an
> > XFD_ERR update.
>
> Ya, I was assuming there was something L0 needed to do to support
> nested AMX, but as Paolo pointed out there are no VMCS bits, so L0
> just needs to correctly handle #NM and MSR interceptions according to
> vmcs12.

btw, Sean still made a good point on the exception queuing part. The
current version blindly queues a #NM even when L1 wants to intercept
#NM itself. We have fixed that internally and will send out a new
version very soon; a sketch of both this check and the xfd-sync flag
follows below the sign-off.

Thanks
Kevin
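
To make Kevin's closing point concrete, a minimal sketch of the check
being described could look like the following. This is an illustration,
not the actual patch: is_guest_mode(), get_vmcs12(), nested_vmx_vmexit()
and kvm_queue_exception() are existing KVM/VMX interfaces, while
handle_nm_fault_sketch() itself is a hypothetical helper; in practice
KVM's common nested-event machinery can also perform this reflection
for a queued exception.

/*
 * Sketch only: if L1 intercepts #NM in the vmcs12 exception bitmap,
 * the fault belongs to L1, so synthesize a nested VM-exit instead of
 * blindly queuing the exception into the L2 context.
 */
static int handle_nm_fault_sketch(struct kvm_vcpu *vcpu)
{
	if (is_guest_mode(vcpu) &&
	    (get_vmcs12(vcpu)->exception_bitmap & (1u << NM_VECTOR))) {
		nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
				  NM_VECTOR | INTR_TYPE_HARD_EXCEPTION |
				  INTR_INFO_VALID_MASK, 0);
		return 1;
	}

	/* L1 doesn't intercept #NM: queue it for the current guest. */
	kvm_queue_exception(vcpu, NM_VECTOR);
	return 1;
}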
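
Similarly, the "extra flag" approach from the second open item might
look roughly like the sketch below, assuming a hypothetical per-vCPU
flag (xfd_no_write_intercept here) that KVM sets when it disables write
interception for IA32_XFD; the flag name and fpstate field path are
illustrative assumptions, not taken from the series.

/*
 * Sketch only: once writes to IA32_XFD are pass-through, the guest can
 * change the MSR without a VM-exit, so KVM's software copy must be
 * refreshed from hardware after every exit.
 */
static void kvm_sync_guest_xfd_sketch(struct kvm_vcpu *vcpu)
{
	u64 xfd;

	if (!vcpu->arch.xfd_no_write_intercept)	/* hypothetical flag */
		return;

	/* Hardware still holds the guest value at this point. */
	rdmsrl(MSR_IA32_XFD, xfd);
	vcpu->arch.guest_fpu.fpstate->xfd = xfd;
}

This also makes the per-exit cost mentioned in the thread visible: one
MSR read on every VM-exit once the flag is set, whether or not the
guest ever runs AMX code.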