Do not recover (i.e. zap) an NX Huge Page that is being dirty tracked,
as it will just be faulted back in at the same 4KiB granularity when
accessed by a vCPU. This may need to be changed if KVM ever supports
2MiB (or larger) dirty tracking granularity, or faulting huge pages
during dirty tracking for reads/executes. However, for now, these zaps
are entirely wasteful.

This commit does nominally increase the CPU usage of the NX recovery
worker by about 1% when testing with a VM with 16 slots.

Signed-off-by: David Matlack <dmatlack@xxxxxxxxxx>
---
In order to check if this commit increases the CPU usage of the NX
recovery worker thread, I used a modified version of execute_perf_test
[1] that supports splitting guest memory into multiple slots and
reports /proc/pid/schedstat:se.sum_exec_runtime for the NX recovery
worker just before tearing down the VM. The goal was to force a large
number of NX Huge Page recoveries and see if the recovery worker used
any more CPU.

Test Setup:

  echo 1000 > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
  echo 10 > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio

Test Command:

  ./execute_perf_test -v64 -s anonymous_hugetlb_1gb -x 16 -o

        | kvm-nx-lpage-re:se.sum_exec_runtime      |
        | ---------------------------------------- |
Run     | Before             | After               |
------- | ------------------ | ------------------- |
1       | 730.084105         | 724.375314          |
2       | 728.751339         | 740.581988          |
3       | 736.264720         | 757.078163          |

Comparing the median results, this commit results in about a 1%
increase in the CPU usage of the NX recovery worker.

[1] https://lore.kernel.org/kvm/20221019234050.3919566-2-dmatlack@xxxxxxxxxx/

v2:
 - Only skip NX Huge Pages that are actively being dirty tracked [Paolo]

v1: https://lore.kernel.org/kvm/20221027200316.2221027-1-dmatlack@xxxxxxxxxx/

 arch/x86/kvm/mmu/mmu.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 82bc6321e58e..1c443f9aeb4b 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6831,6 +6831,7 @@ static int set_nx_huge_pages_recovery_param(const char *val, const struct kernel
 static void kvm_recover_nx_huge_pages(struct kvm *kvm)
 {
 	unsigned long nx_lpage_splits = kvm->stat.nx_lpage_splits;
+	struct kvm_memory_slot *slot;
 	int rcu_idx;
 	struct kvm_mmu_page *sp;
 	unsigned int ratio;
@@ -6865,7 +6866,21 @@ static void kvm_recover_nx_huge_pages(struct kvm *kvm)
 				      struct kvm_mmu_page,
 				      possible_nx_huge_page_link);
 		WARN_ON_ONCE(!sp->nx_huge_page_disallowed);
-		if (is_tdp_mmu_page(sp))
+		WARN_ON_ONCE(!sp->role.direct);
+
+		slot = gfn_to_memslot(kvm, sp->gfn);
+		WARN_ON_ONCE(!slot);
+
+		/*
+		 * Unaccount and do not attempt to recover any NX Huge Pages
+		 * that are being dirty tracked, as they would just be faulted
+		 * back in as 4KiB pages. The NX Huge Pages in this slot will be
+		 * recovered, along with all the other huge pages in the slot,
+		 * when dirty logging is disabled.
+		 */
+		if (slot && kvm_slot_dirty_track_enabled(slot))
+			unaccount_nx_huge_page(kvm, sp);
+		else if (is_tdp_mmu_page(sp))
 			flush |= kvm_tdp_mmu_zap_sp(kvm, sp);
 		else
 			kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
-- 
2.38.1.431.g37b22c650d-goog
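
P.S. For reviewers who don't have the helper in their head:
kvm_slot_dirty_track_enabled() is an existing accessor in
include/linux/kvm_host.h. Below is a minimal sketch of the check the
patch keys off of, renamed to make clear it is illustrative and not
part of the diff (the real definition may differ slightly across
trees):

  #include <linux/kvm_host.h>

  /* Illustrative only: mirrors kvm_slot_dirty_track_enabled(). */
  static inline bool slot_dirty_track_enabled_sketch(const struct kvm_memory_slot *slot)
  {
  	/*
  	 * Dirty logging is enabled per-memslot (e.g. via the
  	 * KVM_MEM_LOG_DIRTY_PAGES flag in KVM_SET_USER_MEMORY_REGION),
  	 * so a simple flag test on the slot suffices.
  	 */
  	return slot->flags & KVM_MEM_LOG_DIRTY_PAGES;
  }

Since the check is per-slot rather than per-page, every disallowed NX
Huge Page whose gfn lands in a dirty-logged slot is unaccounted instead
of zapped; per the comment in the patch, those huge pages are recovered
along with the rest of the slot's huge pages once dirty logging is
disabled.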