Re: [PATCH v3 03/16] KVM: arm64: nv: Handle shadow stage 2 page faults

On Thu, Aug 22, 2024 at 03:11:16AM +0800, Zenghui Yu wrote:
> > +
> > +	if (nested) {
> > +		unsigned long max_map_size;
> > +
> > +		max_map_size = force_pte ? PAGE_SIZE : PUD_SIZE;
> > +
> > +		ipa = kvm_s2_trans_output(nested);
> > +
> > +		/*
> > +		 * If we're about to create a shadow stage 2 entry, then we
> > +		 * can only create a block mapping if the guest stage 2 page
> > +		 * table uses at least as big a mapping.
> > +		 */
> > +		max_map_size = min(kvm_s2_trans_size(nested), max_map_size);
> > +
> > +		/*
> > +		 * Be careful that if the mapping size falls between
> > +		 * two host sizes, take the smallest of the two.
> > +		 */
> > +		if (max_map_size >= PMD_SIZE && max_map_size < PUD_SIZE)
> > +			max_map_size = PMD_SIZE;
> > +		else if (max_map_size >= PAGE_SIZE && max_map_size < PMD_SIZE)
> > +			max_map_size = PAGE_SIZE;
> > +
> > +		force_pte = (max_map_size == PAGE_SIZE);
> > +		vma_pagesize = min(vma_pagesize, (long)max_map_size);
> > +	}
> > +
> >  	if (vma_pagesize == PMD_SIZE || vma_pagesize == PUD_SIZE)
> >  		fault_ipa &= ~(vma_pagesize - 1);
> >  
> > -	gfn = fault_ipa >> PAGE_SHIFT;
> > +	gfn = ipa >> PAGE_SHIFT;
> 
> I had seen a non-nested guest boot failure (with vma_pagesize ==
> PUD_SIZE) and bisection led me here.
> 
> Is it intentional to ignore the fault_ipa adjustment when calculating
> gfn if the guest memory is backed by hugetlbfs? This looks broken for
> the non-nested case.
> 
> But since I haven't looked at user_mem_abort() for a long time, I'm not
> sure if I'd missed something...

Nope, you're spot on as usual.

Seems like we'd want to make sure both the canonical IPA and fault IPA
are hugepage-aligned to get the right PFN and map it at the right place.

I repro'ed the boot failure, and the following diff gets me back in
business. I was _just_ about to send the second batch of fixes, but this
is a rather smelly one.

Unless someone screams, this is getting stuffed on top.

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 6981b1bc0946..a509b63bd4dd 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1540,8 +1540,15 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		vma_pagesize = min(vma_pagesize, (long)max_map_size);
 	}
 
-	if (vma_pagesize == PMD_SIZE || vma_pagesize == PUD_SIZE)
+	/*
+	 * Both the canonical IPA and fault IPA must be hugepage-aligned to
+	 * ensure we find the right PFN and lay down the mapping in the right
+	 * place.
+	 */
+	if (vma_pagesize == PMD_SIZE || vma_pagesize == PUD_SIZE) {
 		fault_ipa &= ~(vma_pagesize - 1);
+		ipa &= ~(vma_pagesize - 1);
+	}
 
 	gfn = ipa >> PAGE_SHIFT;
 	mte_allowed = kvm_vma_mte_allowed(vma);


-- 
Thanks,
Oliver
