On Tue, Jan 04, 2022, Paolo Bonzini wrote: > On 12/29/21 14:13, Yang Zhong wrote: > > Highly appreciate for your review. This version mostly addressed the comments > > from Sean. Most comments are adopted except three which are not closed and > > need more discussions: > > > > - Move the entire xfd write emulation code to x86.c. Doing so requires > > introducing a new kvm_x86_ops callback to disable msr write bitmap. > > According to Paolo's earlier comment he prefers to handle it in vmx.c. > > Yes, I do. No objection, my comments were prior to seeing the patches that manipulated the bitmap, e.g. in the earlier patches, having anything in vmx.c is unnecessary. > > - Directly check msr_bitmap in update_exception_bitmap() (for > > trapping #NM) and vcpu_enter_guest() (for syncing guest xfd after > > vm-exit) instead of introducing an extra flag in the last patch. However, > > doing so requires another new kvm_x86_ops callback for checking > > msr_bitmap since vcpu_enter_guest() is x86 common code. Having an > > extra flag sounds simpler here (at least for the initial AMX support). > > It does penalize nested guest with one xfd sync per exit, but it's not > > worse than a normal guest which initializes xfd but doesn't run > > AMX applications at all. Those could be improved afterwards. > > The thing to do here would be to move > MAX_POSSIBLE_PASSTHROUGH_MSRS/MAX_DIRECT_ACCESS_MSRS from VMX/SVM to core > code. For now we can keep the flag. > > > - Disable #NM trap for nested guest. This version still chooses to always > > trap #NM (regardless in L1 or L2) as long as xfd write interception is disabled. > > In reality #NM is rare if nested guest doesn't intend to run AMX applications > > and always-trap is safer than dynamic trap for the basic support in case > > of any oversight here. > > Sean was justifying this with lack of support for nested AMX, but I'm not > sure actually what is missing at all. That is, an L1 hypervisor could > expose AMX to L2, and then an L2->L0->L2 exit/reentry would have to trap > #NM. Otherwise it would miss an XFD_ERR update. Ya, I was assuming there was something L0 needed to do to supported nested AMX, but as Paolo pointed out there are no VMCS bits, so L0 just needs to correctly handle #NM and MSR interceptions according to vmcs12.