Explicitly check kvm_shadow_root_alloced() when short-circuiting shadow
paging metadata allocations, and skip setting "shadow_root_alloced" if
and only if it's already true, i.e. set it when short-circuiting because
TDP is disabled.  This fixes a benign bug where KVM would always take
slots_arch_lock when allocating a shadow root due to
"shadow_root_alloced" never being set.

Opportunistically add comments to call out that not freeing successful
allocations on failure is intentional, and that freeing on failure isn't
straightforward, so as to discourage incorrect cleanups in the future.

Fixes: 73f122c4f06f ("KVM: cleanup allocation of rmaps and page tracking data")
Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
---
Essentially code review for "KVM: cleanup allocation of rmaps and page
tracking data", which AFAICT didn't get posted (because it came in via
a merge?).

 arch/x86/kvm/mmu/mmu.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index c6ddb042b281..757e2a1ed149 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3412,21 +3412,30 @@ static int mmu_first_shadow_root_alloc(struct kvm *kvm)
 
         mutex_lock(&kvm->slots_arch_lock);
 
+        /* Recheck, under the lock, whether this is the first shadow root. */
+        if (kvm_shadow_root_alloced(kvm))
+                goto out_unlock;
+
         /*
-         * Check if anything actually needs to be allocated. This also
-         * rechecks whether this is the first shadow root under the lock.
+         * Check if anything actually needs to be allocated, e.g. all metadata
+         * will be allocated upfront if TDP is disabled.
          */
         if (kvm_memslots_have_rmaps(kvm) &&
             kvm_page_track_write_tracking_enabled(kvm))
-                goto out_unlock;
+                goto out_success;
 
         for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
                 slots = __kvm_memslots(kvm, i);
                 kvm_for_each_memslot(slot, slots) {
                         /*
-                         * Both of these functions are no-ops if the target
-                         * is already allocated, so unconditionally calling
-                         * both is safe.
+                         * Both of these functions are no-ops if the target is
+                         * already allocated, so unconditionally calling both
+                         * is safe.  Intentionally do NOT free allocations on
+                         * failure to avoid having to track which allocations
+                         * were made now versus when the memslot was created.
+                         * The metadata is guaranteed to be freed when the slot
+                         * is freed, and will be kept/used if userspace retries
+                         * KVM_RUN instead of killing the VM.
                          */
                         r = memslot_rmap_alloc(slot, slot->npages);
                         if (r)
@@ -3441,6 +3450,7 @@ static int mmu_first_shadow_root_alloc(struct kvm *kvm)
          * Ensure that shadow_root_alloced becomes true strictly after
          * all the related pointers are set.
          */
+out_success:
         smp_store_release(&kvm->arch.shadow_root_alloced, true);
 
 out_unlock:
-- 
2.33.0.1079.g6e70778dc9-goog
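
For readers who want the locking pattern in isolation: below is a small,
self-contained user-space C sketch of the same double-checked "allocate
on first use" idiom, written purely for illustration.  It is not KVM
code; the names (first_use_alloc(), metadata_preallocated,
metadata_alloced) are invented stand-ins for
mmu_first_shadow_root_alloc(), the "TDP disabled, everything allocated
upfront" case, and shadow_root_alloced.  The bug being fixed is the
equivalent of taking the out_unlock exit, instead of out_success, on the
"nothing to allocate" path, which leaves the flag permanently false and
sends every later caller through the mutex.

/*
 * Standalone illustration of the pattern, NOT KVM code.  All names are
 * made up for this sketch.
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_bool metadata_alloced = false;
static bool metadata_preallocated = true;       /* the "TDP disabled" analogue */

static int first_use_alloc(void)
{
        int r = 0;

        /* Fast path, pairs with the release store below. */
        if (atomic_load_explicit(&metadata_alloced, memory_order_acquire))
                return 0;

        pthread_mutex_lock(&lock);

        /* Recheck, under the lock, whether this is the first user. */
        if (atomic_load_explicit(&metadata_alloced, memory_order_acquire))
                goto out_unlock;

        /*
         * Nothing to allocate if the metadata was set up upfront, but the
         * flag must still be set, i.e. don't jump straight to out_unlock.
         */
        if (metadata_preallocated)
                goto out_success;

        /* ... allocate metadata here, goto out_unlock on failure ... */

out_success:
        /* Publish the flag only after the allocations it guards. */
        atomic_store_explicit(&metadata_alloced, true, memory_order_release);
out_unlock:
        pthread_mutex_unlock(&lock);
        return r;
}

int main(void)
{
        printf("first call:  %d\n", first_use_alloc());
        printf("second call: %d\n", first_use_alloc());   /* skips the mutex */
        return 0;
}

The release store paired with the acquire load on the fast path plays the
same role as smp_store_release() in the patch: the flag only becomes
visible after everything it guards has been set up.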