[PATCH 0/2] optimize spte zapping in zap_gfn_range()

Mingwei Zhang <mizhang@xxxxxxxxxx> · Wed, 24 Nov 2021 21:44:19 +0000

TDP MMU SPTE zapping process currently uses two levels of iterations. The
first level iteration happens at the for loop within the zap_gfn_range()
with the purpose of calibrating the accurate range for zapping. The second
level itreration start at tdp_mmu_set_spte{,_atomic}() that tears down the
whole paging structures (leaf and non-leaf SPTEs) within the range. The
former iteration is yield safe, while the second one is not.

In many cases, zapping SPTE process could be optimized since the non-leaf
SPTEs could most likely be retained for the next allocation. On the other
hand, for large scale SPTE zapping scenarios, we may end up zapping too
many SPTEs and use excessive CPU time that causes the RCU stall warning.

The follow selftest reproduces the warning:

        (env: kvm.tdp_mmu=Y)
        ./dirty_log_perf_test -v 64 -b 8G

This patch set revert a previous optimization and create a helper
__zap_gfn_range() to help optimize the zapping process.

In particular, it does the following two things:
 - optimize the zapping by retaining some non-leaf SPTEs.
 - avoid RCU stall warning when zapping too many SPTEs.

Mingwei Zhang (2):
  Revert "KVM: x86/mmu: Don't step down in the TDP iterator when zapping
    all SPTEs"
  KVM: mmu/x86: optimize zapping by retaining non-leaf SPTEs and avoid
    rcu stall

 arch/x86/kvm/mmu/tdp_mmu.c | 66 +++++++++++++++++++++++---------------
 1 file changed, 41 insertions(+), 25 deletions(-)

--
2.34.0.rc2.393.gf8c9666880-goog