On Fri, Nov 19, 2021, David Matlack wrote:
> Factor out the logic to atomically replace an SPTE with an SPTE that
> points to a new page table. This will be used in a follow-up commit to
> split a large page SPTE into one level lower.
>
> Signed-off-by: David Matlack <dmatlack@xxxxxxxxxx>
> ---
>  arch/x86/kvm/mmu/tdp_mmu.c | 53 ++++++++++++++++++++++++++------------
>  1 file changed, 37 insertions(+), 16 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index cc9fe33c9b36..9ee3f4f7fdf5 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -945,6 +945,39 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu,
>  	return ret;
>  }
>
> +/*
> + * tdp_mmu_install_sp_atomic - Atomically replace the given spte with an
> + * spte pointing to the provided page table.
> + *
> + * @kvm: kvm instance
> + * @iter: a tdp_iter instance currently on the SPTE that should be set
> + * @sp: The new TDP page table to install.
> + * @account_nx: True if this page table is being installed to split a
> + *              non-executable huge page.
> + *
> + * Returns: True if the new page table was installed. False if spte being
> + *          replaced changed, causing the atomic compare-exchange to fail.
> + *          If this function returns false the sp will be freed before
> + *          returning.
> + */
> +static bool tdp_mmu_install_sp_atomic(struct kvm *kvm,
> +				      struct tdp_iter *iter,
> +				      struct kvm_mmu_page *sp,
> +				      bool account_nx)
> +{
> +	u64 spte;
> +
> +	spte = make_nonleaf_spte(sp->spt, !shadow_accessed_mask);

This can easily go on one line.

	u64 spte = make_nonleaf_spte(sp->spt, !shadow_accessed_mask);

> +
> +	if (tdp_mmu_set_spte_atomic(kvm, iter, spte)) {
> +		tdp_mmu_link_page(kvm, sp, account_nx);
> +		return true;
> +	} else {
> +		tdp_mmu_free_sp(sp);
> +		return false;

I don't think this helper should free the sp on failure, even if that's what
all paths end up doing.  When reading the calling code, it really looks like
the sp is being leaked because the allocation and free are in different
contexts.  That the sp is consumed on success is fairly intuitive given the
"install" action, but freeing on failure not so much.

And for the eager splitting, freeing on failure is wasteful.  It's unlikely to
happen often enough to be an issue in practice, but it's certainly odd since
the loop is likely going to turn around and immediately allocate another sp,
either for the current spte or for the next spte.

Side topic, tdp_mmu_set_spte_atomic() and friends really should return
0/-EBUSY.  Boolean returns for errors usually end in tears sooner or later.
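
Something along these lines is what I have in mind (completely untested, and
it assumes tdp_mmu_set_spte_atomic() is also converted to return 0/-EBUSY;
take it as a sketch of the direction, not final code):

	static int tdp_mmu_install_sp_atomic(struct kvm *kvm,
					     struct tdp_iter *iter,
					     struct kvm_mmu_page *sp,
					     bool account_nx)
	{
		u64 spte = make_nonleaf_spte(sp->spt, !shadow_accessed_mask);
		int ret;

		/*
		 * Don't free the sp on failure; the caller allocated it, so
		 * let the caller decide what to do with it if the cmpxchg
		 * loses the race.
		 */
		ret = tdp_mmu_set_spte_atomic(kvm, iter, spte);
		if (ret)
			return ret;

		tdp_mmu_link_page(kvm, sp, account_nx);
		return 0;
	}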
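
and then the alloc/free pairing stays visible in the caller, e.g. something
like (paraphrasing the map path from memory, with the surrounding loop and
the exact allocation helper name elided/approximate):

		sp = alloc_tdp_mmu_page(vcpu, iter.gfn, iter.level - 1);

		/* Allocation and free now live in the same context. */
		if (tdp_mmu_install_sp_atomic(vcpu->kvm, &iter, sp,
					      account_nx)) {
			tdp_mmu_free_sp(sp);
			break;
		}

The eager split path could then also choose to hang on to the sp and retry
with it, instead of freeing it and immediately allocating a new one.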