On Tue, 25 Feb 2020 at 22:20, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
> On 25/02/20 10:47, Wanpeng Li wrote:
> > From: Wanpeng Li <wanpengli@xxxxxxxxxxx>
> >
> > In the vCPU reset and set APIC_BASE MSR path, the apic map will be recalculated
> > several times, each time it will consume 10+ us observed by ftrace in my
> > non-overcommit environment since the expensive memory allocate/mutex/rcu etc
> > operations. This patch optimizes it by recalculating apic map in batch, I hope
> > this can benefit the serverless scenario which can frequently create/destroy
> > VMs.
> >
> > Signed-off-by: Wanpeng Li <wanpengli@xxxxxxxxxxx>
> > ---
> > v1 -> v2:
> >  * add apic_map_dirty to kvm_lapic
> >  * error condition in kvm_apic_set_state, do recalculate unconditionally
> >
> >  arch/x86/kvm/lapic.c | 29 +++++++++++++++++++----------
> >  arch/x86/kvm/lapic.h |  2 ++
> >  arch/x86/kvm/x86.c   |  2 ++
> >  3 files changed, 23 insertions(+), 10 deletions(-)
> >
> > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > index afcd30d..3476dbc 100644
> > --- a/arch/x86/kvm/lapic.c
> > +++ b/arch/x86/kvm/lapic.c
> > @@ -164,7 +164,7 @@ static void kvm_apic_map_free(struct rcu_head *rcu)
> >      kvfree(map);
> >  }
> >
> > -static void recalculate_apic_map(struct kvm *kvm)
> > +void kvm_recalculate_apic_map(struct kvm *kvm)
> >  {
>
> It's better to add an "if" here rather than in every caller. It should
> be like:
>
>     if (!apic->apic_map_dirty) {
>         /*
>          * Read apic->apic_map_dirty before
>          * kvm->arch.apic_map.
>          */
>         smp_rmb();
>         return;
>     }
>
>     mutex_lock(&kvm->arch.apic_map_lock);
>     if (!apic->apic_map_dirty) {
>         /* Someone else has updated the map. */
>         mutex_unlock(&kvm->arch.apic_map_lock);
>         return;
>     }
>     ...
> out:
>     old = rcu_dereference_protected(kvm->arch.apic_map,
>             lockdep_is_held(&kvm->arch.apic_map_lock));
>     rcu_assign_pointer(kvm->arch.apic_map, new);
>     /*
>      * Write kvm->arch.apic_map before
>      * clearing apic->apic_map_dirty.
>      */
>     smp_wmb();
>     apic->apic_map_dirty = false;
>     mutex_unlock(&kvm->arch.apic_map_lock);
>     ...
>
> But actually it seems to me that, given we're going through all this
> pain, it's better to put the "dirty" flag in kvm->arch, next to the
> mutex and the map itself. This should also reduce the number of calls
> to kvm_recalculate_apic_map that recompute the map. A lot of them will
> just wait on the mutex and exit.

Good point, will do in next version.

    Wanpeng
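
For illustration only, the suggested shape (one dirty flag kept next to the
mutex and the map, checked once without the lock and once with it) can be
sketched in user space. This is a minimal sketch, not KVM code: it assumes
C11 atomics and pthreads in place of smp_rmb()/smp_wmb(), the kernel mutex
and RCU, and every name in it (demo_arch, demo_map, demo_init,
demo_mark_dirty, demo_recalculate_map) is hypothetical. It also assumes that
nobody marks the map dirty while a rebuild is running.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the map; not the real struct kvm_apic_map. */
struct demo_map {
    int generation;
};

/* Hypothetical stand-in for kvm->arch: lock, flag and map side by side. */
struct demo_arch {
    pthread_mutex_t map_lock;
    atomic_bool map_dirty;
    _Atomic(struct demo_map *) map;
    int recalcs;                /* rebuild count, protected by map_lock */
};

static void demo_init(struct demo_arch *a)
{
    int rc = pthread_mutex_init(&a->map_lock, NULL);

    assert(rc == 0);
    (void)rc;
    atomic_init(&a->map_dirty, false);
    atomic_init(&a->map, NULL);
    a->recalcs = 0;
}

/* Callers that change the inputs of the map only set the flag. */
static void demo_mark_dirty(struct demo_arch *a)
{
    atomic_store_explicit(&a->map_dirty, true, memory_order_release);
}

static void demo_recalculate_map(struct demo_arch *a)
{
    struct demo_map *new_map, *old_map;

    /* Fast path: the acquire load stands in for the flag read + smp_rmb(). */
    if (!atomic_load_explicit(&a->map_dirty, memory_order_acquire))
        return;

    pthread_mutex_lock(&a->map_lock);
    if (!atomic_load_explicit(&a->map_dirty, memory_order_relaxed)) {
        /* Someone else has updated the map while we waited for the lock. */
        pthread_mutex_unlock(&a->map_lock);
        return;
    }

    new_map = malloc(sizeof(*new_map));
    if (!new_map) {
        /* Leave the flag set so that a later call retries. */
        pthread_mutex_unlock(&a->map_lock);
        return;
    }
    old_map = atomic_load_explicit(&a->map, memory_order_relaxed);
    new_map->generation = old_map ? old_map->generation + 1 : 1;
    a->recalcs++;

    /* Publish the map before clearing the flag (stands in for smp_wmb()). */
    atomic_store_explicit(&a->map, new_map, memory_order_release);
    atomic_store_explicit(&a->map_dirty, false, memory_order_release);
    pthread_mutex_unlock(&a->map_lock);

    /* The kernel would defer this through RCU; this sketch has no readers. */
    free(old_map);
}
```

With this layout, calling demo_mark_dirty() several times followed by
demo_recalculate_map() rebuilds the map once, and further calls return on
the lockless fast path until the flag is set again.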