Re: [PATCH] kvm: Nested KVM MMUs need PAE root too

Jiří Paleček <jpalecek@xxxxxx> writes:

> On AMD processors, in PAE 32-bit mode, nested KVM instances don't
> work. The L0 host gets a kernel oops, which is related to
> arch.mmu->pae_root being NULL.

The date on the message is "Date: Sat, 22 Jun 2019 19:42:04 +0200" and I
got a bit confused at first.

>
> The reason for this is that when setting up nested KVM instance,
> arch.mmu is set to &arch.guest_mmu (while normally, it would be
> &arch.root_mmu).

The MMU switch to guest_mmu happens when we're about to start the L2
guest, and we switch back to root_mmu when we want to execute L1 again
(e.g. after a vmexit), so this is not a one-time thing ('when setting
up nested KVM instance' makes it sound like one).

> However, the initialization and allocation of
> pae_root only creates it in root_mmu. KVM code (ie. in
> mmu_alloc_shadow_roots) then accesses arch.mmu->pae_root, which is the
> unallocated arch.guest_mmu->pae_root.

Fixes: 14c07ad89f4d ("x86/kvm/mmu: introduce guest_mmu")

>
> This fix just allocates (and frees) pae_root in both guest_mmu and
> root_mmu (and also lm_root if it was allocated). The allocation is
> subject to the previous restrictions, i.e. it won't allocate anything
> on 64-bit and AFAIK not on Intel.

Right, only 32-bit NPT uses that (and it's definitely undertested).

>
> See bug 203923 for details.

Personally, I'd prefer this to be the full link
https://bugzilla.kernel.org/show_bug.cgi?id=203923
as there are multiple bugzillas out there.

>
> Signed-off-by: Jiri Palecek <jpalecek@xxxxxx>
> Tested-by: Jiri Palecek <jpalecek@xxxxxx>

This is weird. I always thought "Signed-off-by" implies some form of
testing (unless stated otherwise) :-)


>
> ---
>  arch/x86/kvm/mmu.c | 30 ++++++++++++++++++++++--------
>  1 file changed, 22 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 24843cf49579..efa8285bb56d 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -5592,13 +5592,13 @@ slot_handle_leaf(struct kvm *kvm, struct kvm_memory_slot *memslot,
>  				 PT_PAGE_TABLE_LEVEL, lock_flush_tlb);
>  }
>
> -static void free_mmu_pages(struct kvm_vcpu *vcpu)
> +static void free_mmu_pages(struct kvm_mmu *mmu)
>  {
> -	free_page((unsigned long)vcpu->arch.mmu->pae_root);
> -	free_page((unsigned long)vcpu->arch.mmu->lm_root);
> +	free_page((unsigned long)mmu->pae_root);
> +	free_page((unsigned long)mmu->lm_root);
>  }
>
> -static int alloc_mmu_pages(struct kvm_vcpu *vcpu)
> +static int alloc_mmu_pages(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu)
>  {
>  	struct page *page;
>  	int i;
> @@ -5619,9 +5619,9 @@ static int alloc_mmu_pages(struct kvm_vcpu *vcpu)
>  	if (!page)
>  		return -ENOMEM;
>
> -	vcpu->arch.mmu->pae_root = page_address(page);
> +	mmu->pae_root = page_address(page);
>  	for (i = 0; i < 4; ++i)
> -		vcpu->arch.mmu->pae_root[i] = INVALID_PAGE;
> +		mmu->pae_root[i] = INVALID_PAGE;
>
>  	return 0;
>  }
> @@ -5629,6 +5629,7 @@ static int alloc_mmu_pages(struct kvm_vcpu *vcpu)
>  int kvm_mmu_create(struct kvm_vcpu *vcpu)
>  {
>  	uint i;
> +	int ret;
>
>  	vcpu->arch.mmu = &vcpu->arch.root_mmu;
>  	vcpu->arch.walk_mmu = &vcpu->arch.root_mmu;
> @@ -5646,7 +5647,19 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu)
>  		vcpu->arch.guest_mmu.prev_roots[i] = KVM_MMU_ROOT_INFO_INVALID;
>
>  	vcpu->arch.nested_mmu.translate_gpa = translate_nested_gpa;
> -	return alloc_mmu_pages(vcpu);
> +
> +	ret = alloc_mmu_pages(vcpu, &vcpu->arch.guest_mmu);
> +	if (ret)
> +		return ret;
> +
> +	ret = alloc_mmu_pages(vcpu, &vcpu->arch.root_mmu);
> +	if (ret)
> +		goto fail_allocate_root;

(personal preference) here you're just jumping over 'return' so I'd
prefer this to be written as:

        ret = alloc_mmu_pages(vcpu, &vcpu->arch.root_mmu);
        if (!ret)
            return 0;
 
        free_mmu_pages(&vcpu->arch.guest_mmu);
        return ret;

and no label/goto required.

> +
> +	return ret;
> + fail_allocate_root:
> +	free_mmu_pages(&vcpu->arch.guest_mmu);
> +	return ret;
>  }
>
>  static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
> @@ -6102,7 +6115,8 @@ unsigned long kvm_mmu_calculate_default_mmu_pages(struct kvm *kvm)
>  void kvm_mmu_destroy(struct kvm_vcpu *vcpu)
>  {
>  	kvm_mmu_unload(vcpu);
> -	free_mmu_pages(vcpu);
> +	free_mmu_pages(&vcpu->arch.root_mmu);
> +	free_mmu_pages(&vcpu->arch.guest_mmu);
>  	mmu_free_memory_caches(vcpu);
>  }
>
> --
> 2.23.0.rc1
>

-- 
Vitaly


