On Tue, Jan 26, 2021 at 12:48 PM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
> On 26/01/21 19:11, Ben Gardon wrote:
> > When I did a strict replacement I found ~10% worse memory population
> > performance.
> > Running dirty_log_perf_test -v 96 -b 3g -i 5 with the TDP MMU
> > disabled, I got 119 sec to populate memory as the baseline and 134 sec
> > with an earlier version of this series which just replaced the
> > spinlock with an rwlock. I believe this difference is statistically
> > significant, but I didn't run multiple trials.
> > I didn't take notes when profiling, but I'm pretty sure the rwlock
> > slow path showed up a lot. This was a very high-contention scenario,
> > so it's probably not indicative of real-world performance.
>
> In the slow path, the rwlock is certainly slower than a spin lock.
>
> > If the real impact doesn't seem too large, I'd be very happy to just
> > replace the spinlock.
>
> Ok, so let's use the union idea and add a "#define KVM_HAVE_MMU_RWLOCK"
> to x86. The virt/kvm/kvm_main.c MMU notifier functions can use the
> #define to pick between write_lock and spin_lock.

I'm not entirely sure I understand this suggestion. Are you suggesting
we'd put the spinlock and the rwlock in a union in struct kvm, but then
use a static #define to choose which one the rest of the code uses? If
the choice is made with static #defines, the union doesn't seem to add
any value. If we do use the union, I think the advantages offered by
__weak wrapper functions, overridden on a per-arch basis, are
worthwhile.

> For x86 I want to switch to tdp_mmu=1 by default as soon as parallel
> page faults are in, so we can use the rwlock unconditionally and drop
> the wrappers, except possibly for some kind of kvm_mmu_lock/unlock_root
> that chooses between read_lock for TDP MMU and write_lock for shadow
> MMU.
>
> Thanks!
>
> Paolo
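
To make sure we're comparing the same things, here's how I read the
union + "#define KVM_HAVE_MMU_RWLOCK" idea. This is a completely
untested sketch; the member and macro names and file placements are
just illustrative, not from any posted patch:

	/* include/linux/kvm_host.h */
	struct kvm {
		/* ... */
		union {
			spinlock_t mmu_lock;	/* arches without KVM_HAVE_MMU_RWLOCK */
			rwlock_t mmu_rwlock;	/* arches with KVM_HAVE_MMU_RWLOCK */
		};
		/* ... */
	};

	/* virt/kvm/kvm_main.c, used by the MMU notifier functions */
	#ifdef KVM_HAVE_MMU_RWLOCK
	#define KVM_MMU_LOCK(kvm)	write_lock(&(kvm)->mmu_rwlock)
	#define KVM_MMU_UNLOCK(kvm)	write_unlock(&(kvm)->mmu_rwlock)
	#else
	#define KVM_MMU_LOCK(kvm)	spin_lock(&(kvm)->mmu_lock)
	#define KVM_MMU_UNLOCK(kvm)	spin_unlock(&(kvm)->mmu_lock)
	#endif

If that's the shape of it, the #ifdef already knows which union member
is live at every call site, which is why the union doesn't seem to buy
anything over just #ifdef-ing the type of mmu_lock.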
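
And here's roughly what I mean by the __weak wrappers, again as an
untested sketch: generic code always calls kvm_mmu_lock() and
kvm_mmu_unlock(), and an architecture that wants the rwlock overrides
the defaults:

	/* virt/kvm/kvm_main.c: default to the spinlock */
	void __weak kvm_mmu_lock(struct kvm *kvm)
	{
		spin_lock(&kvm->mmu_lock);
	}

	void __weak kvm_mmu_unlock(struct kvm *kvm)
	{
		spin_unlock(&kvm->mmu_lock);
	}

	/* arch/x86/kvm/mmu/mmu.c: override to take the rwlock for write */
	void kvm_mmu_lock(struct kvm *kvm)
	{
		write_lock(&kvm->mmu_rwlock);
	}

	void kvm_mmu_unlock(struct kvm *kvm)
	{
		write_unlock(&kvm->mmu_rwlock);
	}

With the wrappers there's no #ifdef in the generic code, and which
union member is live becomes an implementation detail of each arch's
override.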
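
For the eventual kvm_mmu_lock/unlock_root idea, assuming by that point
x86's mmu_lock is unconditionally an rwlock_t, I'd picture something
like the below. is_tdp_mmu_root() here is a stand-in for whatever
predicate we'd actually end up with:

	/* arch/x86/kvm/mmu/mmu.c, once mmu_lock is an rwlock_t on x86 */
	static void kvm_mmu_lock_root(struct kvm *kvm, hpa_t root)
	{
		if (is_tdp_mmu_root(kvm, root))	/* TDP MMU: parallel faults OK */
			read_lock(&kvm->mmu_lock);
		else				/* shadow MMU: exclusive access */
			write_lock(&kvm->mmu_lock);
	}

	static void kvm_mmu_unlock_root(struct kvm *kvm, hpa_t root)
	{
		if (is_tdp_mmu_root(kvm, root))
			read_unlock(&kvm->mmu_lock);
		else
			write_unlock(&kvm->mmu_lock);
	}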