Re: [patch 2/5] KVM: MMU: allow pinning spte translations (TDP-only)

On 06/19/2014 02:12 AM, mtosatti@xxxxxxxxxx wrote:
Allow vcpus to pin spte translations by:

1) Creating a per-vcpu list of pinned ranges.
2) On mmu reload request:
	- Fault ranges.
	- Mark sptes with a pinned bit.
	- Mark shadow pages as pinned.

3) Then modify the following actions:
	- Page age => skip spte flush.
	- MMU notifiers => force mmu reload request (which kicks cpu out of
				guest mode).
	- GET_DIRTY_LOG => force mmu reload request.
	- SLAB shrinker => skip shadow page deletion.

TDP-only.
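The hunks below reference struct kvm_pinned_page_range without showing its definition. A minimal userspace sketch of the implied record (field names inferred from usage; gfn_t and list_head are mocked, so this is an assumption, not the patch's actual declaration):

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

typedef unsigned long gfn_t;       /* guest frame number, as in KVM */

/* Implied per-vcpu pinned-range record: one node per registered
 * range, chained into vcpu->arch.pinned_mmu_pages and bounded by
 * KVM_MAX_PER_VCPU_PINNED_RANGE entries. */
struct kvm_pinned_page_range {
	gfn_t base_gfn;            /* first gfn of the pinned range */
	unsigned long npages;      /* number of pages in the range */
	struct list_head link;     /* entry in pinned_mmu_pages */
};
```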

+int kvm_mmu_register_pinned_range(struct kvm_vcpu *vcpu,
+				  gfn_t base_gfn, unsigned long npages)
+{
+	struct kvm_pinned_page_range *p;
+
+	mutex_lock(&vcpu->arch.pinned_mmu_mutex);
+	list_for_each_entry(p, &vcpu->arch.pinned_mmu_pages, link) {
+		if (p->base_gfn == base_gfn && p->npages == npages) {
+			mutex_unlock(&vcpu->arch.pinned_mmu_mutex);
+			return -EEXIST;
+		}
+	}
+	mutex_unlock(&vcpu->arch.pinned_mmu_mutex);
+
+	if (vcpu->arch.nr_pinned_ranges >=
+	    KVM_MAX_PER_VCPU_PINNED_RANGE)
+		return -ENOSPC;
+
+	p = kzalloc(sizeof(struct kvm_pinned_page_range), GFP_KERNEL);
+	if (!p)
+		return -ENOMEM;
+
+	vcpu->arch.nr_pinned_ranges++;
+
+	trace_kvm_mmu_register_pinned_range(vcpu->vcpu_id, base_gfn, npages);
+
+	INIT_LIST_HEAD(&p->link);
+	p->base_gfn = base_gfn;
+	p->npages = npages;
+	mutex_lock(&vcpu->arch.pinned_mmu_mutex);
+	list_add(&p->link, &vcpu->arch.pinned_mmu_pages);
+	mutex_unlock(&vcpu->arch.pinned_mmu_mutex);
+	kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu);
+
+	return 0;
+}
+

What happens if ranges overlap (within a vcpu, or across vcpus)? Or if base_gfn + npages overflows and the range wraps past gfn 0? Or if the range does not refer to RAM?
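For reference, the overlap and wrap-around tests being asked for are a few lines each; a hedged userspace sketch (function names are mine, not from the patch):

```c
#include <assert.h>
#include <stdbool.h>

/* Two gfn ranges [a, a+an) and [b, b+bn) overlap iff each one
 * starts before the other ends. */
static bool ranges_overlap(unsigned long a, unsigned long an,
			   unsigned long b, unsigned long bn)
{
	return a < b + bn && b < a + an;
}

/* base + npages wrapping past 0 shows up as unsigned overflow. */
static bool range_wraps(unsigned long base, unsigned long npages)
{
	return base + npages < base;
}
```

The registration path could reject a new range whenever range_wraps() is true or ranges_overlap() holds against any existing entry, per vcpu or globally.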

Looks like you're limiting the number of ranges, but not the number of pages, so a guest can lock all of its memory.
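Accounting pages instead of ranges would close that hole; a sketch of what such a check could look like (KVM_MAX_PER_VCPU_PINNED_PAGES and the helper are hypothetical names, not from the patch):

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical cap on total pinned pages per vcpu. */
#define KVM_MAX_PER_VCPU_PINNED_PAGES 4096UL

/* Charge npages against a per-vcpu counter; reject overflow and
 * requests that would exceed the cap, so a guest cannot pin all
 * of its memory one range at a time. */
static int account_pinned(unsigned long *nr_pinned_pages,
			  unsigned long npages)
{
	if (*nr_pinned_pages + npages < *nr_pinned_pages)
		return -EINVAL;	/* unsigned overflow */
	if (*nr_pinned_pages + npages > KVM_MAX_PER_VCPU_PINNED_PAGES)
		return -ENOSPC;
	*nr_pinned_pages += npages;
	return 0;
}
```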

+
+/*
+ * Pin KVM MMU page translations. This guarantees, for valid
+ * addresses registered by kvm_mmu_register_pinned_range (valid address
+ * meaning an address which possesses sufficient information for the
+ * fault to be resolved), that valid translations exist while in guest
+ * mode and therefore no VM-exits due to faults will occur.
+ *
+ * Failure to instantiate pages will abort guest entry.
+ *
+ * Page frames should be pinned with get_page in advance.
+ *
+ * Pinning is not guaranteed while executing as L2 guest.

Does this undermine security?

+ *
+ */
+
+static void kvm_mmu_pin_pages(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pinned_page_range *p;
+
+	if (is_guest_mode(vcpu))
+		return;
+
+	if (!vcpu->arch.mmu.direct_map)
+		return;
+
+	ASSERT(VALID_PAGE(vcpu->arch.mmu.root_hpa));
+
+	mutex_lock(&vcpu->arch.pinned_mmu_mutex);

Is the mutex actually needed? It seems it's only taken in vcpu context, so the vcpu mutex should be sufficient.

+	list_for_each_entry(p, &vcpu->arch.pinned_mmu_pages, link) {
+		gfn_t gfn_offset;
+
+		for (gfn_offset = 0; gfn_offset < p->npages; gfn_offset++) {
+			gfn_t gfn = p->base_gfn + gfn_offset;
+			int r;
+			bool pinned = false;
+
+			r = vcpu->arch.mmu.page_fault(vcpu, gfn << PAGE_SHIFT,
+						     PFERR_WRITE_MASK, false,
+						     true, &pinned);
+			/* MMU notifier sequence window: retry */
+			if (!r && !pinned)
+				kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu);
+			if (r) {
+				kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
+				break;
+			}
+
+		}
+	}
+	mutex_unlock(&vcpu->arch.pinned_mmu_mutex);
+}
+
  int kvm_mmu_load(struct kvm_vcpu *vcpu)
  {
  	int r;
@@ -3916,6 +4101,7 @@
  		goto out;
  	/* set_cr3() should ensure TLB has been flushed */
  	vcpu->arch.mmu.set_cr3(vcpu, vcpu->arch.mmu.root_hpa);
+	kvm_mmu_pin_pages(vcpu);
  out:
  	return r;
  }


I don't see where you unpin pages, so even if you limit the number of pinned pages, a guest can pin all of memory by iterating over all of memory and pinning it a chunk at a time.
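The missing unregister path would need to drop the list entry, release the accounting, and force an MMU reload so the pinned sptes for the range are torn down. A hedged userspace sketch with mocked list handling (the function name mirrors the register side but is my invention):

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Singly linked mock of vcpu->arch.pinned_mmu_pages. */
struct pinned_range {
	unsigned long base_gfn, npages;
	struct pinned_range *next;
};

struct vcpu_mock {
	struct pinned_range *pinned;
	unsigned long nr_ranges;
};

/* Counterpart to kvm_mmu_register_pinned_range: remove the matching
 * range and decrement the per-vcpu count. The real version would also
 * kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu) so stale pinned sptes
 * are rebuilt without the pinned bit. */
static int unregister_pinned_range(struct vcpu_mock *vcpu,
				   unsigned long base_gfn,
				   unsigned long npages)
{
	struct pinned_range **pp;

	for (pp = &vcpu->pinned; *pp; pp = &(*pp)->next) {
		if ((*pp)->base_gfn == base_gfn &&
		    (*pp)->npages == npages) {
			struct pinned_range *victim = *pp;
			*pp = victim->next;
			free(victim);
			vcpu->nr_ranges--;
			return 0;
		}
	}
	return -ENOENT;
}
```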

You might try something similar to guest MTRR handling.
