On (21/10/12 09:50), David Matlack wrote:
> On Tue, Oct 12, 2021 at 2:16 AM Sergey Senozhatsky
> <senozhatsky@xxxxxxxxxxxx> wrote:
> >
> > Turn PTE_PREFETCH_NUM into a module parameter, so that it
> > can be tuned per-VM.
>
> Module parameters do not allow tuning per VM, they affect every VM on
> the machine.
>
> If you want per-VM tuning you could introduce a VM ioctl.

ACK.

> > ---
> >  arch/x86/kvm/mmu/mmu.c | 31 ++++++++++++++++++++++---------
>
> Please also update the shadow paging prefetching code in
> arch/x86/kvm/mmu/paging_tmpl.h, unless there is a good reason to
> diverge.

ACK.

> > @@ -732,7 +734,7 @@ static int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indirect)
> >
> >         /* 1 rmap, 1 parent PTE per level, and the prefetched rmaps. */
> >         r = kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_pte_list_desc_cache,
> > -                                      1 + PT64_ROOT_MAX_LEVEL + PTE_PREFETCH_NUM);
> > +                                      1 + PT64_ROOT_MAX_LEVEL + pte_prefetch_num);
>
> There is a sampling problem. What happens if the user changes
> pte_prefetch_num while a fault is being handled?

Good catch.

> > @@ -2753,20 +2755,29 @@ static int direct_pte_prefetch_many(struct kvm_vcpu *vcpu,
> >                                     struct kvm_mmu_page *sp,
> >                                     u64 *start, u64 *end)
> >  {
> > -       struct page *pages[PTE_PREFETCH_NUM];
> > +       struct page **pages;
> >         struct kvm_memory_slot *slot;
> >         unsigned int access = sp->role.access;
> >         int i, ret;
> >         gfn_t gfn;
> >
> > +       pages = kmalloc_array(pte_prefetch_num, sizeof(struct page *),
> > +                             GFP_KERNEL);
>
> This code runs with the MMU lock held.
>
> In general we avoid doing any dynamic memory allocation while the MMU
> lock is held. That's why the memory caches exist. You can avoid
> allocating under a lock by allocating the prefetch array when the vCPU
> is first initialized. This would also solve the module parameter
> sampling problem because you can read it once and store it in struct
> kvm_vcpu.

I'll do per-vCPU pre-allocation, thanks. A rough sketch of what I have
in mind is at the bottom of this mail.

GFP_KERNEL is less of a problem if we hold kvm->mmu_lock for read, but
more of a problem if we hold it for write.

> >  static void __direct_pte_prefetch(struct kvm_vcpu *vcpu,
> > @@ -2785,10 +2798,10 @@ static void __direct_pte_prefetch(struct kvm_vcpu *vcpu,
> >
> >         WARN_ON(!sp->role.direct);
> >
> > -       i = (sptep - sp->spt) & ~(PTE_PREFETCH_NUM - 1);
> > +       i = (sptep - sp->spt) & ~(pte_prefetch_num - 1);
>
> This code assumes pte_prefetch_num is a power of 2, which is now no
> longer guaranteed to be true.

It does. I can test that it's a power of 2 in the ioctl handler (see
the second sketch below).
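
Rough, untested sketch of the per-vCPU pre-allocation, just so we are
talking about the same thing. The helper names and the
vcpu->arch.pte_prefetch_* fields are invented for illustration; the
idea is to sample the module parameter once at vCPU creation, reject
non-power-of-2 values, and never allocate under mmu_lock:

#include <linux/kvm_host.h>
#include <linux/log2.h>
#include <linux/slab.h>

static int kvm_vcpu_alloc_pte_prefetch_array(struct kvm_vcpu *vcpu)
{
        /* Snapshot the module parameter; later writes don't affect this vCPU. */
        unsigned int num = READ_ONCE(pte_prefetch_num);

        /* __direct_pte_prefetch() masks with (num - 1), so require a power of 2. */
        if (!num || !is_power_of_2(num))
                return -EINVAL;

        vcpu->arch.pte_prefetch_num = num;
        vcpu->arch.pte_prefetch_pages = kcalloc(num, sizeof(struct page *),
                                                GFP_KERNEL_ACCOUNT);
        if (!vcpu->arch.pte_prefetch_pages)
                return -ENOMEM;

        return 0;
}

/* Called from vCPU teardown (e.g. kvm_arch_vcpu_destroy()). */
static void kvm_vcpu_free_pte_prefetch_array(struct kvm_vcpu *vcpu)
{
        kfree(vcpu->arch.pte_prefetch_pages);
        vcpu->arch.pte_prefetch_pages = NULL;
}

direct_pte_prefetch_many() would then use vcpu->arch.pte_prefetch_pages
and vcpu->arch.pte_prefetch_num instead of a local array and the raw
module parameter, which removes both the allocation under mmu_lock and
the sampling problem.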
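And a second sketch for the per-VM ioctl direction David suggested,
doing the power-of-2 check up front. The helper name,
kvm->arch.pte_prefetch_num and KVM_MAX_PTE_PREFETCH_NUM are all
invented for illustration; only the validation is the point:

#include <linux/kvm_host.h>
#include <linux/log2.h>

/* Would be called from a per-VM ioctl, e.g. a KVM_ENABLE_CAP handler. */
static int kvm_vm_set_pte_prefetch_num(struct kvm *kvm, u64 num)
{
        if (!num || num > KVM_MAX_PTE_PREFETCH_NUM || !is_power_of_2(num))
                return -EINVAL;

        kvm->arch.pte_prefetch_num = num;
        return 0;
}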