Re: [RFC PATCH V3 2/4] KVM: X86: Introduce role.glevel for level expanded pagetable

On Wed, Mar 30, 2022, Lai Jiangshan wrote:
> +  role.glevel:
> +    The level in guest pagetable if the sp is indirect.  Is 0 if the sp
> +    is direct without corresponding guest pagetable, like TDP or !CR0.PG.
> +    When role.level > guest paging level, indirect sp is created on the
> +    top with role.glevel = guest paging level and acks as passthrough sp

s/acks/acts

> +    and its contents are specially installed rather than the translations
> +    of the corresponding guest pagetable.
>    gfn:
>      Either the guest page table containing the translations shadowed by this
>      page, or the base page frame for linear translations.  See role.direct.
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 9694dd5e6ccc..67e1bccaf472 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -314,7 +314,7 @@ struct kvm_kernel_irq_routing_entry;
>   *     cr0_wp=0, therefore these three bits only give rise to 5 possibilities.
>   *
>   * Therefore, the maximum number of possible upper-level shadow pages for a
> - * single gfn is a bit less than 2^13.
> + * single gfn is a bit less than 2^15.
>   */
>  union kvm_mmu_page_role {
>  	u32 word;
> @@ -331,7 +331,8 @@ union kvm_mmu_page_role {
>  		unsigned smap_andnot_wp:1;
>  		unsigned ad_disabled:1;
>  		unsigned guest_mode:1;
> -		unsigned :6;
> +		unsigned glevel:4;

We don't need 4 bits for this.  Crossing our fingers that we never have to shadow
a 2-level guest with a 6-level host, we can do:

		unsigned passthrough_delta:2;

Where the field is ignored if direct=1, is '0' for non-passthrough, and is 1-3 to
encode shadow_root_level - guest_root_level.  Basically the same idea as Paolo's
smushing of direct+passthrough into mapping_level, just dressed up differently.
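
To make the encoding concrete, here is a rough userspace sketch of how the
2-bit delta could be computed.  Everything named delta_* is hypothetical and
only for illustration; the struct is a trimmed stand-in, not the real
union kvm_mmu_page_role, and assert() stands in for whatever sanity check the
kernel would actually use.

```c
#include <assert.h>

/* Hypothetical, trimmed stand-in for the bitfield in kvm_mmu_page_role. */
struct delta_role {
	unsigned level:4;
	unsigned direct:1;
	unsigned passthrough_delta:2;	/* 0 = non-passthrough, 1-3 = delta */
};

/*
 * Hypothetical helper: value to store in passthrough_delta.
 * Returns 0 for direct pages (the field is ignored there) and for the
 * non-passthrough case where the shadow root is no deeper than the guest's.
 */
static unsigned delta_compute(unsigned shadow_root_level,
			      unsigned guest_root_level, int direct)
{
	unsigned delta;

	if (direct || shadow_root_level <= guest_root_level)
		return 0;

	delta = shadow_root_level - guest_root_level;

	/* Two bits hold at most 3; a 2-level guest on a 6-level host needs 4. */
	assert(delta <= 3);
	return delta;
}
```

E.g. a 4-level guest shadowed with 5-level paging gives 1, and a 2-level guest
shadowed with 5-level paging gives 3, which is the largest value that fits.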

Side topic, we should steal a bit back from "level", or at least document that we
can steal a bit if necessary.

> +		unsigned :2;
>  
>  		/*
>  		 * This is left at the top of the word so that
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 02eae110cbe1..d53037df8177 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -737,8 +737,12 @@ static void mmu_free_pte_list_desc(struct pte_list_desc *pte_list_desc)
>  
>  static gfn_t kvm_mmu_page_get_gfn(struct kvm_mmu_page *sp, int index)
>  {
> -	if (!sp->role.direct)
> +	if (!sp->role.direct) {
> +		if (unlikely(sp->role.glevel < sp->role.level))

Regardless of whatever magic we end up using, there should be an is_passthrough_sp()
helper to wrap the magic.

> +			return sp->gfn;
> +
>  		return sp->gfns[index];
> +	}
>  
>  	return sp->gfn + (index << ((sp->role.level - 1) * PT64_LEVEL_BITS));
>  }
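
For the helper mentioned above, something along these lines is what I have in
mind.  This is a standalone sketch using the glevel encoding from this version
of the patch; struct sketch_sp and sketch_page_get_gfn() are made-up stand-ins
for struct kvm_mmu_page and kvm_mmu_page_get_gfn(), and the check inside
is_passthrough_sp() would change with whatever encoding we settle on.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t gfn_t;

#define PT64_LEVEL_BITS 9

/* Hypothetical, trimmed stand-in for struct kvm_mmu_page. */
struct sketch_sp {
	struct {
		unsigned level:4;
		unsigned direct:1;
		unsigned glevel:4;
	} role;
	gfn_t gfn;
	gfn_t *gfns;
};

/* Wrap the magic in one place so callers don't open code the comparison. */
static bool is_passthrough_sp(const struct sketch_sp *sp)
{
	return !sp->role.direct && sp->role.glevel < sp->role.level;
}

/* Same logic as the hunk above, with the check hidden behind the helper. */
static gfn_t sketch_page_get_gfn(const struct sketch_sp *sp, int index)
{
	if (is_passthrough_sp(sp))
		return sp->gfn;

	if (!sp->role.direct)
		return sp->gfns[index];

	return sp->gfn + ((gfn_t)index << ((sp->role.level - 1) * PT64_LEVEL_BITS));
}
```

The explicit !direct check is there because glevel is 0 for direct pages, which
would otherwise also satisfy glevel < level.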


