> From: Sean Christopherson <seanjc@xxxxxxxxxx>
> Sent: Wednesday, January 5, 2022 2:55 AM
>
> On Tue, Jan 04, 2022, Paolo Bonzini wrote:
> > On 12/29/21 14:13, Yang Zhong wrote:
> > > We highly appreciate your review. This version mostly addressed the
> > > comments from Sean. Most comments were adopted, except three which
> > > are not yet closed and need more discussion:
> > >
> > > - Move the entire xfd write emulation code to x86.c. Doing so
> > >   requires introducing a new kvm_x86_ops callback to disable the
> > >   msr write bitmap. According to Paolo's earlier comment he prefers
> > >   to handle it in vmx.c.
> >
> > Yes, I do.
>
> No objection, my comments were from before seeing the patches that
> manipulated the bitmap, e.g. in the earlier patches, having anything
> in vmx.c was unnecessary.
>
> > > - Directly check the msr_bitmap in update_exception_bitmap() (for
> > >   trapping #NM) and vcpu_enter_guest() (for syncing guest xfd after
> > >   vm-exit) instead of introducing an extra flag in the last patch.
> > >   However, doing so requires another new kvm_x86_ops callback for
> > >   checking the msr_bitmap, since vcpu_enter_guest() is x86 common
> > >   code. Having an extra flag sounds simpler here (at least for the
> > >   initial AMX support). It does penalize a nested guest with one
> > >   xfd sync per exit, but that is no worse than a normal guest which
> > >   initializes xfd but doesn't run AMX applications at all. This
> > >   could be improved afterwards.
> >
> > The thing to do here would be to move
> > MAX_POSSIBLE_PASSTHROUGH_MSRS/MAX_DIRECT_ACCESS_MSRS from VMX/SVM to
> > core code. For now we can keep the flag.

Sounds good.

> > > - Disable the #NM trap for nested guests. This version still
> > >   chooses to always trap #NM (regardless of whether in L1 or L2) as
> > >   long as xfd write interception is disabled. In reality #NM is
> > >   rare if the nested guest doesn't intend to run AMX applications,
> > >   and always-trap is safer than dynamic trap for the basic support,
> > >   in case of any oversight here.
> >
> > Sean was justifying this with the lack of support for nested AMX,
> > but I'm not sure what is actually missing at all. That is, an L1
> > hypervisor could expose AMX to L2, and then an L2->L0->L2
> > exit/reentry would have to trap #NM. Otherwise it would miss an
> > XFD_ERR update.
>
> Ya, I was assuming there was something L0 needed to do to support
> nested AMX, but as Paolo pointed out there are no VMCS bits, so L0
> just needs to correctly handle #NM and MSR interceptions according to
> vmcs12.

btw, Sean still made a good point on the exception queuing part. The
current version blindly queues a #NM even when L1 wants to intercept
#NM itself. We have fixed that internally and will send out a new
version very soon; a sketch of both this check and the xfd-sync flag
follows below the sign-off.

Thanks
Kevin
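
To make Kevin's closing point concrete, a minimal sketch of the check
being described could look like the following. This is an illustration,
not the actual patch: is_guest_mode(), get_vmcs12(), nested_vmx_vmexit()
and kvm_queue_exception() are existing KVM/VMX interfaces, while
handle_nm_fault_sketch() itself is a hypothetical helper; in practice
KVM's common nested-event machinery can also perform this reflection
for a queued exception.

/*
 * Sketch only: if L1 intercepts #NM in the vmcs12 exception bitmap,
 * the fault belongs to L1, so synthesize a nested VM-exit instead of
 * blindly queuing the exception into the L2 context.
 */
static int handle_nm_fault_sketch(struct kvm_vcpu *vcpu)
{
	if (is_guest_mode(vcpu) &&
	    (get_vmcs12(vcpu)->exception_bitmap & (1u << NM_VECTOR))) {
		nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
				  NM_VECTOR | INTR_TYPE_HARD_EXCEPTION |
				  INTR_INFO_VALID_MASK, 0);
		return 1;
	}

	/* L1 doesn't intercept #NM: queue it for the current guest. */
	kvm_queue_exception(vcpu, NM_VECTOR);
	return 1;
}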
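
Similarly, the "extra flag" approach from the second open item might
look roughly like the sketch below, assuming a hypothetical per-vCPU
flag (xfd_no_write_intercept here) that KVM sets when it disables write
interception for IA32_XFD; the flag name and fpstate field path are
illustrative assumptions, not taken from the series.

/*
 * Sketch only: once writes to IA32_XFD are pass-through, the guest can
 * change the MSR without a VM-exit, so KVM's software copy must be
 * refreshed from hardware after every exit.
 */
static void kvm_sync_guest_xfd_sketch(struct kvm_vcpu *vcpu)
{
	u64 xfd;

	if (!vcpu->arch.xfd_no_write_intercept)	/* hypothetical flag */
		return;

	/* Hardware still holds the guest value at this point. */
	rdmsrl(MSR_IA32_XFD, xfd);
	vcpu->arch.guest_fpu.fpstate->xfd = xfd;
}

This also makes the per-exit cost mentioned in the thread visible: one
MSR read on every VM-exit once the flag is set, whether or not the
guest ever runs AMX code.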