On 3/1/22 20:43, Sean Christopherson wrote:
and after spending quite some time I wonder if all this should just be
	if (refcount_dec_not_one(&root->tdp_mmu_root_count))
		return;

	if (!xchg(&root->role.invalid, true)) {
The refcount being '1' means another task can still be using the root;
marking the root invalid will make checks on the root's validity
non-deterministic for that other task.
Do you mean that it's not possible to use refcount_dec_not_one, because
otherwise kvm_tdp_mmu_get_root is not guaranteed to reject the root?
		tdp_mmu_zap_root(kvm, root, shared);

		/*
		 * Do not assume the refcount is still 1: because
		 * tdp_mmu_zap_root can yield, a different task
		 * might have grabbed a reference to this root.
		 */
		if (refcount_dec_not_one(&root->tdp_mmu_root_count))
This is wrong; _this_ task can't drop a reference taken by the other task.
This is essentially the "kvm_tdp_mmu_put_root(kvm, root, shared);" (or
"goto beginning_of_function;") part of your patch.
			return;
	}
	/*
	 * The root is invalid, and its reference count has reached
	 * zero.  It must have been zapped either in the "if" above or
	 * by someone else, and we're definitely the last thread to see
	 * it apart from RCU-protected page table walks.
	 */
	refcount_set(&root->tdp_mmu_root_count, 0);
Not sure what you intended here, KVM should never force a refcount to '0'.
It's turning a refcount_dec_not_one into a refcount_dec_and_test. It
seems legit to me, because the only refcount_inc_not_zero is in a
write-side critical section. If the refcount goes to zero on the
read-side, the root is gone for good.
xchg() is a very good idea. The smp_mb_*() stuff was carried over from the previous
version where this sequence set another flag in addition to role.invalid.
Is this less funky (untested)?
	/*
	 * Invalidate the root to prevent it from being reused by a vCPU while
	 * the root is being zapped, i.e. to allow yielding while zapping the
	 * root (see below).
	 *
	 * Free the root if it's already invalid.  Invalid roots must be zapped
	 * before their last reference is put, i.e. there's no work to be done,
	 * and all roots must be invalidated before they're freed (this code).
	 * Re-zapping invalid roots would put KVM into an infinite loop.
	 *
	 * Note, xchg() provides an implicit barrier to ensure role.invalid is
	 * visible if a concurrent reader acquires a reference after the root's
	 * refcount is reset.
	 */
	if (xchg(&root->role.invalid, true)) {
		spin_lock(&kvm->arch.tdp_mmu_pages_lock);
		list_del_rcu(&root->link);
		spin_unlock(&kvm->arch.tdp_mmu_pages_lock);

		call_rcu(&root->rcu_head, tdp_mmu_free_sp_rcu_callback);
		return;
	}
Based on my own version, I guess you mean (without comments due to
family NMI):
	if (!refcount_dec_and_test(&root->tdp_mmu_root_count))
		return;

	if (!xchg(&root->role.invalid, true)) {
		refcount_set(&root->tdp_mmu_root_count, 1);
		tdp_mmu_zap_root(kvm, root, shared);
		if (!refcount_dec_and_test(&root->tdp_mmu_root_count))
			return;
	}
	spin_lock(&kvm->arch.tdp_mmu_pages_lock);
	list_del_rcu(&root->link);
	spin_unlock(&kvm->arch.tdp_mmu_pages_lock);

	call_rcu(&root->rcu_head, tdp_mmu_free_sp_rcu_callback);
Paolo