On Fri, Feb 28, 2025 at 09:06:18PM -0500, Maxim Levitsky wrote:
> On Wed, 2025-02-05 at 18:23 +0000, Yosry Ahmed wrote:
> > Handle L1's requests to flush L2's TLB through the TLB_CONTROL field of
> > VMCB12. This is currently redundant because a full flush is executed on
> > every nested transition, but is a step towards removing that.
> > 
> > TLB_CONTROL_FLUSH_ALL_ASID flushes all ASIDs from L1's perspective,
> > including its own, so do a guest TLB flush on both transitions. Never
> > propagate TLB_CONTROL_FLUSH_ALL_ASID from the guest to the actual VMCB,
> > as this gives the guest the power to flush the entire physical TLB
> > (including translations for the host and other VMs).
> > 
> > For other cases, the TLB flush is only done when entering L2. The nested
> > NPT MMU is also sync'd because TLB_CONTROL also flushes NPT
> > guest-physical mappings.
> > 
> > All TLB_CONTROL values can be handled by KVM regardless of FLUSHBYASID
> > support on the underlying CPU, so keep advertising FLUSHBYASID to the
> > guest unconditionally.
> > 
> > Signed-off-by: Yosry Ahmed <yosry.ahmed@xxxxxxxxx>
> > ---
> >  arch/x86/kvm/svm/nested.c | 42 ++++++++++++++++++++++++++++++++-------
> >  arch/x86/kvm/svm/svm.c    |  6 +++---
> >  2 files changed, 38 insertions(+), 10 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> > index 0735177b95a1d..e2c59eb2907e8 100644
> > --- a/arch/x86/kvm/svm/nested.c
> > +++ b/arch/x86/kvm/svm/nested.c
> > @@ -484,19 +484,36 @@ static void nested_save_pending_event_to_vmcb12(struct vcpu_svm *svm,
> >  
> >  static void nested_svm_entry_tlb_flush(struct kvm_vcpu *vcpu)
> >  {
> > +	struct vcpu_svm *svm = to_svm(vcpu);
> > +
> >  	/* Handle pending Hyper-V TLB flush requests */
> >  	kvm_hv_nested_transtion_tlb_flush(vcpu, npt_enabled);
> >  
> > +	/*
> > +	 * If L1 requested a TLB flush for L2, flush L2's TLB on nested entry
> > +	 * and sync the nested NPT MMU, as TLB_CONTROL also flushes NPT
> > +	 * guest-physical mappings. We technically only need to flush guest_mode
> > +	 * page tables.
> > +	 *
> > +	 * KVM_REQ_TLB_FLUSH_GUEST will flush L2's ASID even if the underlying
> > +	 * CPU does not support FLUSHBYASID (by assigning a new ASID), so we
> > +	 * can handle all TLB_CONTROL values from L1 regardless.
> > +	 *
> > +	 * Note that TLB_CONTROL_FLUSH_ASID_LOCAL is handled exactly like
> > +	 * TLB_CONTROL_FLUSH_ASID. We can technically flush less TLB entries,
> > +	 * but this would require significantly more complexity.
> > +	 */
> 
> I think it might make sense to note that we in essence support only one
> non-zero ASID in L1: the one that it picks for the nested guest.
> 
> Thus, when asked to do TLB_CONTROL_FLUSH_ALL_ASID, we need to flush only
> L2's real ASID and L1's ASID.

This is described by the comment in nested_svm_exit_tlb_flush(). Do you
mean that we should also mention that here?

I guess one way to make things clearer is to describe the behavior for
all values of TLB_CONTROL here, and in nested_svm_exit_tlb_flush() just
say /* see nested_svm_entry_tlb_flush() */.

Would that improve things?
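To make that concrete, the consolidated comment in
nested_svm_entry_tlb_flush() could look roughly like the below. This is
only a sketch of the idea, summarizing what the patch already does, not
final wording:

	/*
	 * Handle L1's TLB_CONTROL request on nested entry:
	 *
	 *  - TLB_CONTROL_DO_NOTHING: nothing to do.
	 *
	 *  - TLB_CONTROL_FLUSH_ASID / TLB_CONTROL_FLUSH_ASID_LOCAL: flush
	 *    L2's TLB via KVM_REQ_TLB_FLUSH_GUEST and sync the nested NPT
	 *    MMU, as TLB_CONTROL also flushes NPT guest-physical mappings.
	 *    FLUSH_ASID_LOCAL is handled exactly like FLUSH_ASID, flushing
	 *    more than strictly necessary.
	 *
	 *  - TLB_CONTROL_FLUSH_ALL_ASID: as above, and additionally flush
	 *    L1's own TLB entries on the exit back to L1, as "all ASIDs"
	 *    from L1's perspective includes L1's own ASID. See
	 *    nested_svm_exit_tlb_flush().
	 */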
> 
> > +	if (svm->nested.ctl.tlb_ctl != TLB_CONTROL_DO_NOTHING) {
> > +		if (nested_npt_enabled(svm))
> > +			kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
> > +		kvm_make_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu);
> > +	}
> > +
> >  	/*
> >  	 * TODO: optimize unconditional TLB flush/MMU sync.  A partial list of
> >  	 * things to fix before this can be conditional:
> >  	 *
> > -	 *  - Honor L1's request to flush an ASID on nested VMRUN
> > -	 *  - Sync nested NPT MMU on VMRUN that flushes L2's ASID[*]
> >  	 *  - Don't crush a pending TLB flush in vmcb02 on nested VMRUN
> > -	 *
> > -	 * [*] Unlike nested EPT, SVM's ASID management can invalidate nested
> > -	 *     NPT guest-physical mappings on VMRUN.
> >  	 */
> >  	kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
> >  	kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
> > @@ -504,9 +521,18 @@ static void nested_svm_entry_tlb_flush(struct kvm_vcpu *vcpu)
> >  
> >  static void nested_svm_exit_tlb_flush(struct kvm_vcpu *vcpu)
> >  {
> > +	struct vcpu_svm *svm = to_svm(vcpu);
> > +
> >  	/* Handle pending Hyper-V TLB flush requests */
> >  	kvm_hv_nested_transtion_tlb_flush(vcpu, npt_enabled);
> >  
> > +	/*
> > +	 * If L1 had requested a full TLB flush when entering L2, also flush its
> > +	 * TLB entries when exiting back to L1.
> > +	 */
> > +	if (svm->nested.ctl.tlb_ctl == TLB_CONTROL_FLUSH_ALL_ASID)
> > +		kvm_make_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu);
> 
> Makes sense.
> 
> > +
> >  	/* See nested_svm_entry_tlb_flush() */
> >  	kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
> >  	kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
> > @@ -825,7 +851,8 @@ int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb12_gpa,
> >  	nested_svm_copy_common_state(svm->vmcb01.ptr, svm->nested.vmcb02.ptr);
> >  
> >  	svm_switch_vmcb(svm, &svm->nested.vmcb02);
> > -	nested_vmcb02_prepare_control(svm, vmcb12->save.rip, vmcb12->save.cs.base);
> > +	nested_vmcb02_prepare_control(svm, vmcb12->save.rip,
> > +				      vmcb12->save.cs.base);
> >  	nested_vmcb02_prepare_save(svm, vmcb12);
> >  
> >  	ret = nested_svm_load_cr3(&svm->vcpu, svm->nested.save.cr3,
> > @@ -1764,7 +1791,8 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
> >  	nested_copy_vmcb_control_to_cache(svm, ctl);
> >  
> >  	svm_switch_vmcb(svm, &svm->nested.vmcb02);
> > -	nested_vmcb02_prepare_control(svm, svm->vmcb->save.rip, svm->vmcb->save.cs.base);
> > +	nested_vmcb02_prepare_control(svm, svm->vmcb->save.rip,
> > +				      svm->vmcb->save.cs.base);
> >  
> >  	/*
> >  	 * While the nested guest CR3 is already checked and set by
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index 8342c7eadbba8..5e7b1c9bfa605 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -5242,9 +5242,9 @@ static __init void svm_set_cpu_caps(void)
> >  		kvm_cpu_cap_set(X86_FEATURE_VMCBCLEAN);
> >  
> >  		/*
> > -		 * KVM currently flushes TLBs on *every* nested SVM transition,
> > -		 * and so for all intents and purposes KVM supports flushing by
> > -		 * ASID, i.e. KVM is guaranteed to honor every L1 ASID flush.
> > +		 * KVM currently handles all TLB_CONTROL values set by L1, even
> > +		 * if the underlying CPU does not. See
> > +		 * nested_svm_transition_tlb_flush().
> >  		 */
> >  		kvm_cpu_cap_set(X86_FEATURE_FLUSHBYASID);
> 
> Patch looks OK, but I can't be 100% sure about this.
> 
> Reviewed-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
> 
> Best regards,
> 	Maxim Levitsky