On (21/10/19 15:44), David Matlack wrote:
> On Tue, Oct 19, 2021 at 8:32 AM Sergey Senozhatsky
> <senozhatsky@xxxxxxxxxxxx> wrote:
> >
> > kvm_mmu_pte_prefetch is a per-VCPU structure that holds a PTE
> > prefetch pages array, lock and the number of PTE to prefetch.
> >
> > This is needed to turn PTE_PREFETCH_NUM into a tunable VM
> > parameter.
> >
> > Signed-off-by: Sergey Senozhatsky <senozhatsky@xxxxxxxxxxxx>
> > ---
> >  arch/x86/include/asm/kvm_host.h | 12 +++++++
> >  arch/x86/kvm/mmu.h              |  4 +++
> >  arch/x86/kvm/mmu/mmu.c          | 57 ++++++++++++++++++++++++++++++---
> >  arch/x86/kvm/x86.c              |  9 +++++-
> >  4 files changed, 77 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index 5271fce6cd65..11400bc3c70d 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -607,6 +607,16 @@ struct kvm_vcpu_xen {
> >  	u64 runstate_times[4];
> >  };
> >
> > +struct kvm_mmu_pte_prefetch {
> > +	/*
> > +	 * This will be cast either to array of pointers to struct page,
> > +	 * or array of u64, or array of u32
> > +	 */
> > +	void *ents;
> > +	unsigned int num_ents;
> > +	spinlock_t lock;
>
> The spinlock is overkill. I'd suggest something like this:
> - When VM-ioctl is invoked to update prefetch count, store it in
> kvm_arch. No synchronization with vCPUs needed.
> - When a vCPU takes a fault: Read the prefetch count from kvm_arch. If
> different than count at last fault, re-allocate vCPU prefetch array.
> (So you'll need to add prefetch array and count to kvm_vcpu_arch as
> well.)
>
> No extra locks are needed. vCPUs that fault after the VM-ioctl will
> get the new prefetch count. We don't really care if a prefetch count
> update races with a vCPU fault as long as vCPUs are careful to only
> read the count once (i.e. use READ_ONCE(vcpu->kvm.prefetch_count)) and
> use that. Assuming prefetch count ioctls are rare, the re-allocation
> on the fault path will be rare as well.

So reallocation from the fault path should happen before the vCPU
takes the mmu_lock? And READ_ONCE(prefetch_count) should also happen
before the vCPU takes the mmu_lock, I assume, so we need to pass it as
a parameter to all the functions that will access the prefetch array.
(I put a rough sketch of how I read this at the end of this mail.)

> Note: You could apply this same approach to a module param, except
> vCPUs would be reading the module param rather than vcpu->kvm during
> each fault.
>
> And the other alternative, like you suggested in the other patch, is
> to use a vCPU ioctl. That would side-step the synchronization issue
> because vCPU ioctls require the vCPU mutex. So the reallocation could
> be done in the ioctl and not at fault time.

One more idea, I wonder what you think:

There is an upper limit on the number of PTEs we prefetch, which is
128 as of now, but I think 64 will be good enough, or maybe even 32.
So we can always allocate MAX_PTE_PREFETCH_NUM-sized arrays in
vcpu->arch and have the ioctl() change ->num_ents only, which is
always in the (0, MAX_PTE_PREFETCH_NUM - 1] range. This way we never
have to reallocate anything; we just adjust the "maximum index" value
(also sketched at the end of this mail).

> Taking a step back, can you say a bit more about your usecase?

We are looking at various ways of reducing the number of vm-exits.
There is only one VM running on the device (a pretty low-end laptop).
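
Coming back to the re-allocation scheme: here is roughly how I read
it, just to double check. This is only a sketch, not against any
particular tree; the pte_prefetch_num and pte_prefetch field names are
made up for illustration.

static int kvm_vcpu_refresh_pte_prefetch(struct kvm_vcpu *vcpu)
{
	/* Read the VM-wide count once, before taking mmu_lock. */
	unsigned int num = READ_ONCE(vcpu->kvm->arch.pte_prefetch_num);

	if (num == vcpu->arch.pte_prefetch.num_ents)
		return 0;

	/*
	 * The count changed since the last fault: re-allocate, still
	 * outside of mmu_lock, since the allocation may sleep.
	 * kvfree(NULL) is a no-op on the very first fault.
	 */
	kvfree(vcpu->arch.pte_prefetch.ents);
	/* Sized for the largest entry type (pointer/u64). */
	vcpu->arch.pte_prefetch.ents = kvmalloc_array(num, sizeof(u64),
						      GFP_KERNEL_ACCOUNT);
	if (!vcpu->arch.pte_prefetch.ents) {
		vcpu->arch.pte_prefetch.num_ents = 0;
		return -ENOMEM;
	}

	vcpu->arch.pte_prefetch.num_ents = num;
	return 0;
}

And then `num' (rather than a fresh read of kvm->arch) is what gets
passed down to the prefetch functions once mmu_lock is held.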
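
And the fixed-size idea above would boil down to something like this
(again only a sketch; the MAX_PTE_PREFETCH_NUM value and the field
names are illustrative, and how the ioctl is wired up is left out):

#define MAX_PTE_PREFETCH_NUM	64

struct kvm_mmu_pte_prefetch {
	/* Sized for the largest entry type (pointer/u64). */
	u64 ents[MAX_PTE_PREFETCH_NUM];
	unsigned int num_ents;
};

static int kvm_set_pte_prefetch_num_ents(struct kvm_vcpu *vcpu,
					 unsigned int num)
{
	/* Keep num_ents in the (0, MAX_PTE_PREFETCH_NUM - 1] range. */
	if (!num || num >= MAX_PTE_PREFETCH_NUM)
		return -EINVAL;

	/* No reallocation, only the "maximum index" changes. */
	WRITE_ONCE(vcpu->arch.pte_prefetch.num_ents, num);
	return 0;
}

The fault path then does a single
READ_ONCE(vcpu->arch.pte_prefetch.num_ents) and the array itself never
moves.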