Joerg Roedel wrote:
This patch adds support for 1GB pages in the shadow paging code. The
guest can map 1GB pages in its page tables and KVM will map the page
frame with a 1GB, a 2MB or even a 4KB page size, according to the
backing host page size and the write protections in place.
This is the theory. In practice there are conditions which make the
guest unstable when running with this patch and GB pages enabled. The
failing conditions are:
* KVM is loaded using shadow paging
* The Linux guest uses GB pages for the kernel direct mapping
* The guest memory is backed with 4KB pages on the host side
With the above configuration there are random application or kernel
crashes when the guest runs under load. When GB pages for HugeTLBfs
are allocated at boot time in the guest, the guest kernel crashes or
hangs at boot, depending on the amount of RAM in the guest.
The following parameters have no impact:
* The bug also occurs without guest SMP (so likely no race
condition)
* Using PV-MMU makes no difference
I have searched for this bug for quite some time with no real luck.
Maybe some other reviewers will have more luck than I have had so far.
Signed-off-by: Joerg Roedel <joerg.roedel@xxxxxxx>
@@ -729,7 +730,9 @@ static int rmap_write_protect(struct kvm *kvm, u64 gfn)
}
/* check for huge page mappings */
- rmapp = gfn_to_rmap(kvm, gfn, KVM_PAGE_SIZE_2M);
+ psize = KVM_PAGE_SIZE_2M;
+again:
+ rmapp = gfn_to_rmap(kvm, gfn, psize);
spte = rmap_next(kvm, rmapp, NULL);
while (spte) {
BUG_ON(!spte);
@@ -737,7 +740,7 @@ static int rmap_write_protect(struct kvm *kvm, u64 gfn)
BUG_ON((*spte & (PT_PAGE_SIZE_MASK|PT_PRESENT_MASK)) != (PT_PAGE_SIZE_MASK|PT_PRESENT_MASK));
pgprintk("rmap_write_protect(large): spte %p %llx %lld\n", spte, *spte, gfn);
if (is_writeble_pte(*spte)) {
- rmap_remove(kvm, spte, KVM_PAGE_SIZE_2M);
+ rmap_remove(kvm, spte, psize);
--kvm->stat.lpages;
set_shadow_pte(spte, shadow_trap_nonpresent_pte);
spte = NULL;
@@ -746,6 +749,11 @@ static int rmap_write_protect(struct kvm *kvm, u64 gfn)
spte = rmap_next(kvm, rmapp, spte);
}
+ if (psize == KVM_PAGE_SIZE_2M) {
+ psize = KVM_PAGE_SIZE_1G;
+ goto again;
+ }
+
Ugh, use a real loop.
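Something along these lines, as a minimal standalone sketch rather than a
replacement hunk. The enum values, the writable_sptes table and
protect_large_mappings() are hypothetical stand-ins for the KVM_PAGE_SIZE_*
constants and the rmap walk in the patch; only the control flow is the point.

```c
#include <assert.h>

/* Hypothetical stand-ins for KVM_PAGE_SIZE_2M / KVM_PAGE_SIZE_1G. */
enum page_size { PAGE_SIZE_2M, PAGE_SIZE_1G, NR_LARGE_SIZES };

/* Hypothetical model state: writable large sptes per page size. */
static int writable_sptes[NR_LARGE_SIZES] = { 2, 1 };

/*
 * Hypothetical stand-in for the rmap walk: drops every writable
 * large mapping of the given size and returns how many it dropped.
 */
static int protect_large_mappings(enum page_size psize)
{
	int dropped;

	assert(psize >= PAGE_SIZE_2M && psize < NR_LARGE_SIZES);
	dropped = writable_sptes[psize];
	writable_sptes[psize] = 0;
	return dropped;
}

/* Same work as the "again:" label, expressed as a loop over sizes. */
static int write_protect_large(void)
{
	static const enum page_size sizes[] = { PAGE_SIZE_2M, PAGE_SIZE_1G };
	int write_protected = 0;
	unsigned int i;

	for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
		if (protect_large_mappings(sizes[i]) > 0)
			write_protected = 1;

	return write_protected;
}
```

Adding another page size then means adding one array entry instead of
another label and goto.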
return write_protected;
}
@@ -789,11 +797,14 @@ static int kvm_handle_hva(struct kvm *kvm, unsigned long hva,
if (hva >= start && hva < end) {
gfn_t gfn_offset = (hva - start) >> PAGE_SHIFT;
unsigned long lidx = gfn_offset / KVM_PAGES_PER_2M_PAGE;
+ unsigned long hidx = gfn_offset / KVM_PAGES_PER_1G_PAGE;
retval |= handler(kvm, &memslot->rmap[gfn_offset],
KVM_PAGE_SIZE_4k);
retval |= handler(kvm,
&memslot->lpage_info[lidx].rmap_pde,
KVM_PAGE_SIZE_2M);
+ retval |= handler(kvm, &memslot->hpage_info[hidx].rmap_pde,
+ KVM_PAGE_SIZE_1G);
}
}
Isn't this needed for tdp as well?
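For reference, the index arithmetic in the hunk above can be checked in
isolation. This is only a sketch under assumptions: 4KB base pages, and the
names PAGE_SHIFT_4K, PAGES_PER_2M, PAGES_PER_1G and large_indexes() are made
up here, not taken from the patch. Like the quoted hunk, it derives both
indexes from the offset into the slot and ignores slot alignment.

```c
#include <assert.h>

/* Assumed 4KB base pages; these macro names are hypothetical. */
#define PAGE_SHIFT_4K	12
#define PAGES_PER_2M	(1UL << (21 - PAGE_SHIFT_4K))	/* 512 */
#define PAGES_PER_1G	(1UL << (30 - PAGE_SHIFT_4K))	/* 262144 */

struct large_idx {
	unsigned long lidx;	/* index of the 2MB entry for the slot */
	unsigned long hidx;	/* index of the 1GB entry for the slot */
};

/* Mirrors the lidx/hidx computation for an hva inside [start, end). */
static struct large_idx large_indexes(unsigned long hva, unsigned long start)
{
	struct large_idx idx;
	unsigned long gfn_offset;

	assert(hva >= start);
	gfn_offset = (hva - start) >> PAGE_SHIFT_4K;
	idx.lidx = gfn_offset / PAGES_PER_2M;
	idx.hidx = gfn_offset / PAGES_PER_1G;
	return idx;
}
```

An hva that lies 1GB + 2MB into the slot gives lidx 513 and hidx 1, so each
handler call is pointed at the rmap of the large page that contains the
address.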
--
error compiling committee.c: too many arguments to function