On Fri, Feb 28, 2025 at 09:21:54PM -0500, Maxim Levitsky wrote:
> On Wed, 2025-02-05 at 18:24 +0000, Yosry Ahmed wrote:
> > Now that nested TLB flushes are properly tracked with a well-maintained
> > separate ASID for L2 and proper handling of L1's TLB flush requests,
> > drop the unconditional flushes and syncs on nested transitions.
> >
> > On a Milan machine, an L1 and L2 guests were booted, both with a single
> > vCPU, and pinned to a single physical CPU to maximize TLB collisions. In
> > this setup, the cpuid_rate microbenchmark [1] showed the following
> > changes with this patch:
> >
> > +--------+--------+-------------------+----------------------+
> > |   L0   |   L1   | cpuid_rate (base) | cpuid_rate (patched) |
> > +========+========+===================+======================+
> > | NPT    | NPT    | 256621            | 301113 (+17.3%)      |
> > | NPT    | Shadow | 180017            | 203347 (+12.96%)     |
> > | Shadow | Shadow | 177006            | 189150 (+6.86%)      |
> > +--------+--------+-------------------+----------------------+
> >
> > [1]https://lore.kernel.org/kvm/20231109180646.2963718-1-khorenko@xxxxxxxxxxxxx/
> >
> > Signed-off-by: Yosry Ahmed <yosry.ahmed@xxxxxxxxx>
> > ---
> >  arch/x86/kvm/svm/nested.c | 7 -------
> >  1 file changed, 7 deletions(-)
> >
> > diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> > index 8e40ff21f7353..45a187d4c23d1 100644
> > --- a/arch/x86/kvm/svm/nested.c
> > +++ b/arch/x86/kvm/svm/nested.c
> > @@ -512,9 +512,6 @@ static void nested_svm_entry_tlb_flush(struct kvm_vcpu *vcpu)
> >  		svm->nested.last_asid = svm->nested.ctl.asid;
> >  		kvm_make_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu);
> >  	}
> > -	/* TODO: optimize unconditional TLB flush/MMU sync */
> > -	kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
> > -	kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
> >  }
> >
> >  static void nested_svm_exit_tlb_flush(struct kvm_vcpu *vcpu)
> > @@ -530,10 +527,6 @@ static void nested_svm_exit_tlb_flush(struct kvm_vcpu *vcpu)
> >  	 */
> >  	if (svm->nested.ctl.tlb_ctl == TLB_CONTROL_FLUSH_ALL_ASID)
> >  		kvm_make_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu);
> > -
> > -	/* TODO: optimize unconditional TLB flush/MMU sync */
> > -	kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
> > -	kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
> >  }
> >
> >  /*
>
> Assuming that all previous patches are correct this one should work as well.
>
> However only a very heavy stress testing, including hyperv, windows guests
> of various types, etc can give me confidence that there is no some ugly bug
> lurking somewhere.

I tried booting an L2 and running some workloads like netperf in there. I
also tried booting an L3.

I am planning to try and run some testing with a Windows L2 guest. I am
assuming this exercises the Hyper-V emulation in L1, which could be
interesting.

I am not sure if I will be able to test more scenarios though, especially
Windows as an L1 (and something else as an L2). Let me know if you have
something specific in mind.

> TLB management can be very tricky, so I can't be 100% sure that I haven't
> missed something.
>
> Reviewed-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>

Thanks!