On 18/04/17 10:08, Mark Rutland wrote:
On Tue, Apr 18, 2017 at 09:32:31AM +0100, Mark Rutland wrote:
Hi Suzuki,
On Thu, Apr 13, 2017 at 04:50:46PM +0100, Suzuki K. Poulose wrote:
kvm: Hold reference to the user address space
The core KVM code, uses mmgrab/mmdrop to pin the mm struct of the user
application. mmgrab only guarantees that the mm struct is available,
while the "real address space" (see Documentation/vm/active_mm.txt) may
be destroyed. Since the KVM depends on the user space page tables for
the Guest pages, we should instead do an mmget/mmput. Even though
mmget/mmput is not encouraged for uses with unbounded time, the KVM
is fine to do so, as we are doing it from the context of the same process.
This also prevents the race condition where mmu_notifier_release() could
be called in parallel and one instance could end up using a free'd kvm
instance.
Cc: Mark Rutland <mark.rutland@xxxxxxx>
Cc: Paolo Bonzin <pbonzini@xxxxxxxxxx>
Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
Cc: Marc Zyngier <marc.zyngier@xxxxxxx>
Cc: Christoffer Dall <christoffer.dall@xxxxxxxxxx>
Cc: andreyknvl@xxxxxxxxxx
Signed-off-by: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
---
virt/kvm/kvm_main.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 88257b3..555712e 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -613,7 +613,7 @@ static struct kvm *kvm_create_vm(unsigned long type)
return ERR_PTR(-ENOMEM);
spin_lock_init(&kvm->mmu_lock);
- mmgrab(current->mm);
+ mmget(current->mm);
kvm->mm = current->mm;
kvm_eventfd_init(kvm);
mutex_init(&kvm->lock);
@@ -685,7 +685,7 @@ static struct kvm *kvm_create_vm(unsigned long type)
for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++)
kvm_free_memslots(kvm, kvm->memslots[i]);
kvm_arch_free_vm(kvm);
- mmdrop(current->mm);
+ mmput(current->mm);
return ERR_PTR(r);
}
@@ -747,7 +747,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
kvm_arch_free_vm(kvm);
preempt_notifier_dec();
hardware_disable_all();
- mmdrop(mm);
+ mmput(mm);
}
As a heads-up, I'm seeing what looks to be a KVM memory leak with this
patch applied atop of next-20170411.
I don't yet know if this is a problem with next-20170411 or this patch
in particular -- I will try to track that down. In the mean time, info
dump below.
This is indeed a side effect of the new patch. The VCPU doesn't get released
completely, due to an mmap count held on the VCPU fd, even when we close the
VCPU fd. This keeps the refcount on the KVM instance which in turn holds the
mmap count (with the new patch). So the mmap count on VCPU will never get
released due to the circular dependency here. :-(
Suzuki