On 26/01/21 19:11, Ben Gardon wrote:
> When I did a strict replacement I found ~10% worse memory population
> performance. Running dirty_log_perf_test -v 96 -b 3g -i 5 with the
> TDP MMU disabled, I got 119 sec to populate memory as the baseline
> and 134 sec with an earlier version of this series which just
> replaced the spinlock with an rwlock. I believe this difference is
> statistically significant, but didn't run multiple trials. I didn't
> take notes when profiling, but I'm pretty sure the rwlock slowpath
> showed up a lot. This was a very high contention scenario, so it's
> probably not indicative of real-world performance. In the slow path,
> the rwlock is certainly slower than a spin lock. If the real impact
> doesn't seem too large, I'd be very happy to just replace the
> spinlock.
Ok, so let's use the union idea and add a "#define KVM_HAVE_MMU_RWLOCK" to x86. The MMU notifier functions in virt/kvm/kvm_main.c can then use the #define to pick between write_lock and spin_lock.
For x86 I want to switch to tdp_mmu=1 by default as soon as parallel page faults are in, so we can use the rwlock unconditionally and drop the wrappers, except possibly for some kind of kvm_mmu_lock/unlock_root that chooses between read_lock for the TDP MMU and write_lock for the shadow MMU.
Thanks,

Paolo