On 2025-03-14 19:40:08, Sean Christopherson wrote:
> Dynamically allocate the (massive) array of hashed lists used to track
> shadow pages, as the array itself is 32KiB, i.e. is an order-3 allocation
> all on its own, and is *exactly* an order-3 allocation.  Dynamically
> allocating the array will allow allocating "struct kvm" using regular
> kmalloc(), and will also allow deferring allocation of the array until
> it's actually needed, i.e. until the first shadow root is allocated.
>
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
>  arch/x86/include/asm/kvm_host.h |  4 ++--
>  arch/x86/kvm/mmu/mmu.c          | 21 ++++++++++++++++++++-
>  arch/x86/kvm/x86.c              |  5 ++++-
>  3 files changed, 26 insertions(+), 4 deletions(-)
>
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -6673,13 +6685,19 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm)
>  	kvm_tdp_mmu_zap_invalidated_roots(kvm, true);
>  }
>
> -void kvm_mmu_init_vm(struct kvm *kvm)
> +int kvm_mmu_init_vm(struct kvm *kvm)
>  {
> +	int r;
> +
>  	kvm->arch.shadow_mmio_value = shadow_mmio_value;
>  	INIT_LIST_HEAD(&kvm->arch.active_mmu_pages);
>  	INIT_LIST_HEAD(&kvm->arch.possible_nx_huge_pages);
>  	spin_lock_init(&kvm->arch.mmu_unsync_pages_lock);
>
> +	r = kvm_mmu_alloc_page_hash(kvm);
> +	if (r)
> +		return r;
> +

In patch 3, shouldn't this be moved into the else part of the 'if (tdp_mmu_enabled)' check below? Otherwise, the hash array will always get allocated.
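Something along these lines, perhaps (untested sketch of the suggestion above, reusing the 'r' local and kvm_mmu_alloc_page_hash() from the quoted hunk):

	if (tdp_mmu_enabled) {
		kvm_mmu_init_tdp_mmu(kvm);
	} else {
		/* Only shadow MMUs need the hash array up front. */
		r = kvm_mmu_alloc_page_hash(kvm);
		if (r)
			return r;
	}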
>  	if (tdp_mmu_enabled)
>  		kvm_mmu_init_tdp_mmu(kvm);
>
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -12704,7 +12704,9 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  	if (ret)
>  		goto out;
>
> -	kvm_mmu_init_vm(kvm);
> +	ret = kvm_mmu_init_vm(kvm);
> +	if (ret)
> +		goto out_cleanup_page_track;
>
>  	ret = kvm_x86_call(vm_init)(kvm);
>  	if (ret)
> @@ -12757,6 +12759,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>
>  out_uninit_mmu:
>  	kvm_mmu_uninit_vm(kvm);
> +out_cleanup_page_track:

I think there is a memory leak in this series:

1. kvm_mmu_uninit_vm() does not free kvm->arch.mmu_page_hash, so in the error case the out_uninit_mmu path will not reclaim the memory allocated by kvm_mmu_alloc_page_hash().

2. The same thing happens when the VM terminates or is killed: no one reclaims that memory.

>  	kvm_page_track_cleanup(kvm);
>  out:
>  	return ret;
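For the leak, wouldn't something like the following in kvm_mmu_uninit_vm() cover both the error path and normal VM destruction? Untested, and it assumes the array is allocated with kvcalloc()/kvmalloc(), so kvfree() is the matching free:

	/* in kvm_mmu_uninit_vm(), after the existing teardown: */
	kvfree(kvm->arch.mmu_page_hash);
	kvm->arch.mmu_page_hash = NULL;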