On Fri, Nov 19, 2021, David Matlack wrote:
> Factor out the logic to atomically replace an SPTE with an SPTE that
> points to a new page table. This will be used in a follow-up commit to
> split a large page SPTE into one level lower.
>
> Signed-off-by: David Matlack <dmatlack@xxxxxxxxxx>
> ---
>  arch/x86/kvm/mmu/tdp_mmu.c | 53 ++++++++++++++++++++++++++------------
>  1 file changed, 37 insertions(+), 16 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index cc9fe33c9b36..9ee3f4f7fdf5 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -945,6 +945,39 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu,
>  	return ret;
>  }
>
> +/*
> + * tdp_mmu_install_sp_atomic - Atomically replace the given spte with an
> + * spte pointing to the provided page table.
> + *
> + * @kvm: kvm instance
> + * @iter: a tdp_iter instance currently on the SPTE that should be set
> + * @sp: The new TDP page table to install.
> + * @account_nx: True if this page table is being installed to split a
> + *              non-executable huge page.
> + *
> + * Returns: True if the new page table was installed. False if spte being
> + *          replaced changed, causing the atomic compare-exchange to fail.
> + *          If this function returns false the sp will be freed before
> + *          returning.
> + */
> +static bool tdp_mmu_install_sp_atomic(struct kvm *kvm,
> +				      struct tdp_iter *iter,
> +				      struct kvm_mmu_page *sp,
> +				      bool account_nx)
> +{
> +	u64 spte;
> +
> +	spte = make_nonleaf_spte(sp->spt, !shadow_accessed_mask);

This can easily go on one line.

	u64 spte = make_nonleaf_spte(sp->spt, !shadow_accessed_mask);

> +
> +	if (tdp_mmu_set_spte_atomic(kvm, iter, spte)) {
> +		tdp_mmu_link_page(kvm, sp, account_nx);
> +		return true;
> +	} else {
> +		tdp_mmu_free_sp(sp);
> +		return false;

I don't think this helper should free the sp on failure, even if that's what
all paths end up doing.  When reading the calling code, it really looks like
the sp is being leaked because the allocation and free are in different
contexts.  That the sp is consumed on success is fairly intuitive given the
"install" action, but freeing on failure not so much.

And for the eager splitting, freeing on failure is wasteful.  It's unlikely to
happen often enough to be an issue in practice, but it's certainly odd since
the loop is likely going to turn around and immediately allocate another sp,
either for the current spte or for the next spte.

Side topic, tdp_mmu_set_spte_atomic() and friends really should return
0/-EBUSY.  Boolean returns for errors usually end in tears sooner or later.
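
Something along these lines is what I have in mind (completely untested, and
it assumes tdp_mmu_set_spte_atomic() is also converted to return 0/-EBUSY;
take it as a sketch of the direction, not final code):

	static int tdp_mmu_install_sp_atomic(struct kvm *kvm,
					     struct tdp_iter *iter,
					     struct kvm_mmu_page *sp,
					     bool account_nx)
	{
		u64 spte = make_nonleaf_spte(sp->spt, !shadow_accessed_mask);
		int ret;

		/*
		 * Don't free the sp on failure; the caller allocated it, so
		 * let the caller decide what to do with it if the cmpxchg
		 * loses the race.
		 */
		ret = tdp_mmu_set_spte_atomic(kvm, iter, spte);
		if (ret)
			return ret;

		tdp_mmu_link_page(kvm, sp, account_nx);
		return 0;
	}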
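
and then the alloc/free pairing stays visible in the caller, e.g. something
like (paraphrasing the map path from memory, with the surrounding loop and
the exact allocation helper name elided/approximate):

		sp = alloc_tdp_mmu_page(vcpu, iter.gfn, iter.level - 1);

		/* Allocation and free now live in the same context. */
		if (tdp_mmu_install_sp_atomic(vcpu->kvm, &iter, sp,
					      account_nx)) {
			tdp_mmu_free_sp(sp);
			break;
		}

The eager split path could then also choose to hang on to the sp and retry
with it, instead of freeing it and immediately allocating a new one.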