On Mon, Apr 03, 2017 at 03:22:11PM +0100, Mark Rutland wrote:
> Hi,
>
> On Mon, Apr 03, 2017 at 03:12:43PM +0100, Suzuki K Poulose wrote:
> > In kvm_free_stage2_pgd() we don't hold the kvm->mmu_lock while calling
> > unmap_stage2_range() on the entire memory range for the guest. This
> > could cause problems with other callers (e.g., munmap on a memslot)
> > trying to unmap a range. And since we have to unmap the entire guest
> > memory range holding a spinlock, make sure we yield the lock if
> > necessary, after we unmap each PUD range.
> >
> > Fixes: commit d5d8184d35c9 ("KVM: ARM: Memory virtualization setup")
> > Cc: stable@xxxxxxxxxxxxxxx # v3.10+
> > Cc: Paolo Bonzini <pbonzin@xxxxxxxxxx>
> > Cc: Marc Zyngier <marc.zyngier@xxxxxxx>
> > Cc: Christoffer Dall <christoffer.dall@xxxxxxxxxx>
> > Cc: Mark Rutland <mark.rutland@xxxxxxx>
> > Signed-off-by: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
> > [ Avoid vCPU starvation and lockup detector warnings ]
> > Signed-off-by: Marc Zyngier <marc.zyngier@xxxxxxx>
> > Signed-off-by: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
> >
> > ---
> > Changes since V2:
> >  - Restrict kvm->mmu_lock relaxation to bigger ranges in
> >    unmap_stage2_range(), to avoid possible issues like [0]
> >
> > [0] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-March/498210.html
>
> Sorry if I'm being thick, but how does restricting this to a larger
> range help with the "sleeping function called from invalid context"
> issue?
>
> Surely that just makes it rarer?

As far as I can tell, the only caller of unmap_stage2_range() on the
problematic path (where the extra lock is already held) is
try_to_unmap_one(), via kvm_unmap_hva(), and that always passes
PAGE_SIZE as the size argument. Since PAGE_SIZE is always smaller than
S2_PUD_SIZE, the cond_resched_lock() is never reached in that context
(see the sketch after the quoted patch below).

Did I miss something?

Thanks,
-Christoffer

> >
> > Changes since V1:
> >  - Yield the kvm->mmu_lock if necessary in unmap_stage2_range to
> >    prevent vCPU starvation and lockup detector warnings.
> > ---
> >  arch/arm/kvm/mmu.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> > index 13b9c1f..db94f3a 100644
> > --- a/arch/arm/kvm/mmu.c
> > +++ b/arch/arm/kvm/mmu.c
> > @@ -292,8 +292,15 @@ static void unmap_stage2_range(struct kvm *kvm, phys_addr_t start, u64 size)
> >  	phys_addr_t addr = start, end = start + size;
> >  	phys_addr_t next;
> >
> > +	assert_spin_locked(&kvm->mmu_lock);
> >  	pgd = kvm->arch.pgd + stage2_pgd_index(addr);
> >  	do {
> > +		/*
> > +		 * If the range is too large, release the kvm->mmu_lock
> > +		 * to prevent starvation and lockup detector warnings.
> > +		 */
> > +		if (size > S2_PUD_SIZE)
> > +			cond_resched_lock(&kvm->mmu_lock);
> >  		next = stage2_pgd_addr_end(addr, end);
> >  		if (!stage2_pgd_none(*pgd))
> >  			unmap_stage2_puds(kvm, pgd, addr, next);
> > @@ -831,7 +838,10 @@ void kvm_free_stage2_pgd(struct kvm *kvm)
> >  	if (kvm->arch.pgd == NULL)
> >  		return;
> >
> > +	spin_lock(&kvm->mmu_lock);
> >  	unmap_stage2_range(kvm, 0, KVM_PHYS_SIZE);
> > +	spin_unlock(&kvm->mmu_lock);
> > +
> >  	/* Free the HW pgd, one page at a time */
> >  	free_pages_exact(kvm->arch.pgd, S2_PGD_SIZE);
> >  	kvm->arch.pgd = NULL;
> > --
> > 2.7.4
> >
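
For reference, this is roughly the path I mean. A simplified sketch from
memory of arch/arm/kvm/mmu.c around this version, so the exact helper
names and signatures may be slightly off:

/*
 * MMU notifier path: try_to_unmap_one() holds a spinlock (the rmap/pte
 * lock) when the notifier fires, so nothing on this path may sleep.
 */
static int kvm_unmap_hva_handler(struct kvm *kvm, gpa_t gpa, void *data)
{
	/*
	 * size is always PAGE_SIZE here, and PAGE_SIZE <= S2_PUD_SIZE,
	 * so with the new "if (size > S2_PUD_SIZE)" check,
	 * cond_resched_lock() is never called on this path.
	 */
	unmap_stage2_range(kvm, gpa, PAGE_SIZE);
	return 0;
}

int kvm_unmap_hva(struct kvm *kvm, unsigned long hva)
{
	unsigned long end = hva + PAGE_SIZE;

	if (!kvm->arch.pgd)
		return 0;

	handle_hva_to_gpa(kvm, hva, end, &kvm_unmap_hva_handler, NULL);
	return 0;
}

So only callers that can actually pass a large range, such as
kvm_free_stage2_pgd() unmapping the whole KVM_PHYS_SIZE, ever hit the
cond_resched_lock(), and those run in a context where sleeping is fine.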