> >
> > 1) Previously mmu_topup_memory_caches() works fine without a lock.
> > 2) IMHO I was suspecting if this lock seems affects the parallelization
> > of the TDP MMU fault handling.
> >
> > TDP MMU fault handling is intend to be optimized for parallelization fault
> > handling by taking a read lock and operating the page table via atomic
> > operations. Multiple fault handling can enter the TDP MMU fault path
> > because of read_lock(&vcpu->kvm->mmu_lock) below.
> >
> > W/ this lock, it seems the part of benefit of parallelization is gone
> > because the lock can contend earlier above. Will this cause performance
> > regression?
>
> This is a per vCPU lock, with this lock each vCPU will still be able
> to perform parallel fault handling without contending for lock.
>

I am curious how effective trying to acquire this per-vCPU lock is. If a
vCPU thread stays within the (host) kernel (VMX root/non-root) for the vast
majority of the time, won't the shrinker always fail to make any progress?
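
To make the concern concrete, here is a toy userspace sketch, not the code
from the patch. It assumes the shrinker takes the per-vCPU lock with a
trylock and skips any vCPU whose lock is held. The names toy_vcpu,
cache_lock, nobjs and toy_shrink_scan are made up for illustration, and a
pthread mutex stands in for the kernel mutex.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a vCPU: a lock guarding a count of cached objects. */
struct toy_vcpu {
	pthread_mutex_t cache_lock;	/* models the per-vCPU cache lock */
	int nobjs;			/* models objects sitting in the vCPU's caches */
};

/*
 * Toy shrinker scan (assumed behaviour): try each vCPU's lock and skip the
 * vCPU if the lock is already held. Returns the number of objects reclaimed.
 */
static int toy_shrink_scan(struct toy_vcpu *vcpus, size_t n)
{
	int freed = 0;

	for (size_t i = 0; i < n; i++) {
		if (pthread_mutex_trylock(&vcpus[i].cache_lock) != 0)
			continue;	/* lock is held: nothing reclaimed from this vCPU */

		freed += vcpus[i].nobjs;
		vcpus[i].nobjs = 0;
		pthread_mutex_unlock(&vcpus[i].cache_lock);
	}

	return freed;
}
```

In this model, a vCPU that holds cache_lock whenever the scan runs never
contributes anything to the return value, which is the case I am asking
about.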