On Tue, May 11, 2021, Ben Gardon wrote: > If the TDP MMU is in use, wait to allocate the rmaps until the shadow > MMU is actually used. (i.e. a nested VM is launched.) This saves memory > equal to 0.2% of guest memory in cases where the TDP MMU is used and > there are no nested guests involved. > > Signed-off-by: Ben Gardon <bgardon@xxxxxxxxxx> > --- > arch/x86/include/asm/kvm_host.h | 2 ++ > arch/x86/kvm/mmu/mmu.c | 53 +++++++++++++++++++++++---------- > arch/x86/kvm/mmu/tdp_mmu.c | 6 ++-- > arch/x86/kvm/mmu/tdp_mmu.h | 4 +-- > arch/x86/kvm/x86.c | 45 +++++++++++++++++++++++++++- > 5 files changed, 89 insertions(+), 21 deletions(-) > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > index fc75ed49bfee..7b65f82ade1c 100644 > --- a/arch/x86/include/asm/kvm_host.h > +++ b/arch/x86/include/asm/kvm_host.h > @@ -1868,4 +1868,6 @@ static inline int kvm_cpu_get_apicid(int mps_cpu) > > int kvm_cpu_dirty_log_size(void); > > +int alloc_all_memslots_rmaps(struct kvm *kvm); > + > #endif /* _ASM_X86_KVM_HOST_H */ > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c > index b0bdb924d519..183afccd2944 100644 > --- a/arch/x86/kvm/mmu/mmu.c > +++ b/arch/x86/kvm/mmu/mmu.c > @@ -1190,7 +1190,8 @@ static void kvm_mmu_write_protect_pt_masked(struct kvm *kvm, > kvm_tdp_mmu_clear_dirty_pt_masked(kvm, slot, > slot->base_gfn + gfn_offset, mask, true); > > - if (!kvm->arch.memslots_have_rmaps) > + /* Read memslots_have_rmaps before the rmaps themselves */ IIRC, you open coded reading memslots_have_rmaps because of a circular dependency, but you can solve that simply by defining the helper in mmu.h instead of kvm_host.h. And I think you could even make it static in mmu.c and omit the smp_load_acuquire from the three users in x86.c, though that's probably not worth it. Either way, reading the same comment over and over and over, just to make checkpatch happy, gets more than a bit tedious. That would also allow you to elaborate on why the smp_load_acquire() is necessary, and preferably what it pairs with. > + if (!smp_load_acquire(&kvm->arch.memslots_have_rmaps)) > return; > > while (mask) {