Re: [PATCH v2 4/4] KVM: x86/mmu: Recover TDP MMU NX huge pages using MMU read lock

On 2024-08-29 13:06:55, Sean Christopherson wrote:
> On Thu, Aug 29, 2024, Vipin Sharma wrote:
> >  		if (need_resched() || rwlock_needbreak(&kvm->mmu_lock)) {
> 
> Tsk, tsk.  There's a cond_resched_rwlock_write() lurking here.

I'm gonna order a dunce cap.
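
Yeah, the open-coded check plus the unlock/relock dance should collapse
into a single call that does both checks internally:

	cond_resched_rwlock_write(&kvm->mmu_lock);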

> 
> Which is a decent argument/segue for my main comment: I would very, very strongly
> prefer to have a single flow for the control logic.  Almost all of the code is
> copy+pasted to the TDP MMU, and there are unnecessary and confusing differences between
> the two flows.  E.g. the TDP MMU does unaccount_nx_huge_page() preemptively, while
> the shadow MMU does not.
> 
> The TDP MMU has a few extra "locks" (rcu_read_lock() + tdp_mmu_pages_lock), but
> I don't see any harm in taking those in the shadow MMU flow.  KVM holds a spinlock
> and so RCU practically is meaningless, and tdp_mmu_pages_lock will never be
> contended since it's only ever taken under mmu_lock.
> 
> If we _really_ wanted to bury tdp_mmu_pages_lock in the TDP MMU, it could be
> passed in as an optional parameter.  I'm not super opposed to that, it just looks
> ugly, and I'm not convinced it's worth the effort.  Maybe a middle ground would
> be to provide a helper so that tdp_mmu_pages_lock can stay 64-bit only, but this
> code doesn't need #ifdefs?

How about declaring it in tdp_mmu.h and providing different definitions,
similar to is_tdp_mmu_page(), i.e. a no-op for 32-bit builds and actual
lock usage for 64-bit builds?
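
Roughly like this (helper names are made up, just to show the shape):

#ifdef CONFIG_X86_64
static inline void tdp_mmu_pages_lock(struct kvm *kvm)
{
	spin_lock(&kvm->arch.tdp_mmu_pages_lock);
}

static inline void tdp_mmu_pages_unlock(struct kvm *kvm)
{
	spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
}
#else
static inline void tdp_mmu_pages_lock(struct kvm *kvm) { }
static inline void tdp_mmu_pages_unlock(struct kvm *kvm) { }
#endif

Then the common recovery flow can take/release the lock unconditionally
and mmu.c stays free of #ifdefs.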

> 
> Anyways, if the helper is __always_inline, there shouldn't be an indirect call
> to recover_page().  Completely untested, but this is the basic gist.
> 

One more way I can do this is to reuse the "shared" bool to decide which
nx_recover_page_t to call. Something like:

static __always_inline void kvm_recover_nx_huge_pages(...)
{
	...
	if (!kvm_mmu_sp_dirty_logging_enabled(kvm, sp)) {
		/* "shared" selects the zap helper for the owning MMU. */
		if (shared)
			flush |= tdp_mmu_zap_possible_nx_huge_page(kvm, sp);
		else
			kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
	}
}

tdp_mmu_zap_possible_nx_huge_page() can WARN_ONCE() and return early if a
shadow MMU page is passed to it. One downside is that the "shared" bool
and the "pages" list are now dependent on each other.

This way we don't need to rely on __always_inline to do the right thing
in order to avoid a function pointer call.

I think the bool's usage is easier to understand than why we used
__always_inline. What do you think?


> ---
> typedef bool (*nx_recover_page_t)(struct kvm *kvm, struct kvm_mmu_page *sp,
> 				  struct list_head *invalid_pages);
> 
> static __always_inline void kvm_recover_nx_huge_pages(struct kvm *kvm,
> 						      bool shared,
> 						      spinlock_t *list_lock,
> 						      struct list_head *pages,
> 						      unsigned long nr_pages,
> 						      nx_recover_page_t recover_page)
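
If we go this route, I'd expect the two call sites to look roughly like
the below (callback and list names are made up, just to show the shape):

	/* Shadow MMU flow, mmu_lock held for write. */
	kvm_recover_nx_huge_pages(kvm, false,
				  &kvm->arch.tdp_mmu_pages_lock,
				  &kvm->arch.possible_nx_huge_pages,
				  to_zap, shadow_mmu_recover_nx_huge_page);

	/* TDP MMU flow, mmu_lock held for read. */
	kvm_recover_nx_huge_pages(kvm, true,
				  &kvm->arch.tdp_mmu_pages_lock,
				  &kvm->arch.possible_tdp_mmu_nx_huge_pages,
				  to_zap, tdp_mmu_recover_nx_huge_page);

Since recover_page is a compile-time constant at each call site, the
inlined body should end up with direct calls in both cases.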



