The TDP MMU SPTE zapping process currently uses two levels of iteration. The
first level is the for loop in zap_gfn_range(), whose purpose is to determine
the exact range to zap. The second level starts at
tdp_mmu_set_spte{,_atomic}(), which tears down the whole paging structure
(leaf and non-leaf SPTEs) within the range. The first level is yield safe,
while the second is not.

In many cases, zapping could be optimized, since the non-leaf SPTEs could
most likely be retained for the next allocation. On the other hand, in
large-scale zapping scenarios, we may end up zapping too many SPTEs and using
excessive CPU time, which triggers the RCU stall warning.

The following selftest reproduces the warning:

	(env: kvm.tdp_mmu=Y)
	./dirty_log_perf_test -v 64 -b 8G

This patch set reverts a previous optimization and creates a helper,
__zap_gfn_range(), to help optimize the zapping process. In particular, it
does the following two things:

 - optimize the zapping by retaining some non-leaf SPTEs.
 - avoid the RCU stall warning when zapping too many SPTEs.

Mingwei Zhang (2):
  Revert "KVM: x86/mmu: Don't step down in the TDP iterator when zapping
    all SPTEs"
  KVM: mmu/x86: optimize zapping by retaining non-leaf SPTEs and avoid
    rcu stall

 arch/x86/kvm/mmu/tdp_mmu.c | 66 +++++++++++++++++++++++---------------
 1 file changed, 41 insertions(+), 25 deletions(-)

-- 
2.34.0.rc2.393.gf8c9666880-goog
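
For readers who want a picture of the two goals listed above, here is a
minimal user-space sketch. It is not the code in this series and not kernel
code: struct toy_table, toy_init(), toy_zap_range() and the fixed "budget"
counter are all hypothetical names invented for illustration. The counter is
only a stand-in for a real reschedule check. The sketch walks a range one leaf
at a time, clears leaf entries while leaving the non-leaf (table) entries
present, and records a yield point every "budget" zapped leaves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model, not kernel code: one level of tables over leaves. */
#define ENTRIES_PER_TABLE 8
#define NR_TABLES 4

struct toy_table {
	bool present;                  /* models a non-leaf SPTE */
	bool leaf[ENTRIES_PER_TABLE];  /* models the leaf SPTEs below it */
};

struct zap_stats {
	size_t leaves_zapped;
	size_t yields;
};

/* Mark every table and every leaf as present. */
static void toy_init(struct toy_table *tables)
{
	for (size_t t = 0; t < NR_TABLES; t++) {
		tables[t].present = true;
		for (size_t i = 0; i < ENTRIES_PER_TABLE; i++)
			tables[t].leaf[i] = true;
	}
}

/*
 * Zap the leaves in [start, end) and keep the tables themselves.
 * A yield point is counted after every `budget` zapped leaves; this
 * counter is an assumption standing in for a real reschedule check.
 */
static struct zap_stats toy_zap_range(struct toy_table *tables,
				      size_t start, size_t end, size_t budget)
{
	struct zap_stats stats = { 0, 0 };
	size_t since_yield = 0;

	assert(end <= (size_t)NR_TABLES * ENTRIES_PER_TABLE);
	assert(budget > 0);

	for (size_t i = start; i < end; i++) {
		struct toy_table *t = &tables[i / ENTRIES_PER_TABLE];

		if (!t->present) {
			/* Nothing below this table: skip to its last slot. */
			i = (i / ENTRIES_PER_TABLE + 1) * ENTRIES_PER_TABLE - 1;
			continue;
		}
		if (!t->leaf[i % ENTRIES_PER_TABLE])
			continue;

		t->leaf[i % ENTRIES_PER_TABLE] = false;  /* zap the leaf only */
		stats.leaves_zapped++;

		if (++since_yield == budget) {
			stats.yields++;  /* safe point: no teardown in flight */
			since_yield = 0;
		}
	}
	return stats;
}
```

Because each step touches a single leaf, the walk can stop between any two
steps, which is the property the non-yield-safe second level lacks. To try it,
call toy_init() on an array of NR_TABLES tables and then toy_zap_range() on a
sub-range; the tables stay present while only the leaves inside the range are
cleared.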