Re: [PATCH] KVM: x86/mmu: fix potential races when walking host page table

On Fri, Apr 29, 2022, Paolo Bonzini wrote:
> On 4/29/22 16:35, Sean Christopherson wrote:
> > On Fri, Apr 29, 2022, Paolo Bonzini wrote:
> > > > +out:
> > > > +	local_irq_restore(flags);
> > > > +	return level;
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(kvm_lookup_address_level_in_mm);
> > > 
> > > Exporting is not needed.
> > > 
> > > Thanks for writing the walk code though.  I'll adapt it and integrate the
> > > patch.
> > 
> > But why are we fixing this only in KVM?  I liked the idea of stealing perf's
> > implementation because it was a seemingly perfect fit and wouldn't introduce
> > new code (ignoring wrappers, etc...).
> > 
> > We _know_ that at least one subsystem is misusing lookup_address_in_pgd() and
> > given that its wrappers are exported, I highly doubt KVM is the only offender.
> > It really feels like we're passing the buck here by burying the fix in KVM.
> 
> There are two ways to do it:
> 
> * having a generic function in mm/.  The main issue there is the lack of a
> PG_LEVEL_{P4D,PUD,PMD,PTE} enum at the mm/ level.  We could use (ctz(x) -
> 12) / 9 to go from size to level, but it's ugly and there could be
> architectures with heterogeneous page table sizes.
> 
> * having a generic function in arch/x86/.  In this case KVM seems to be the
> odd one that doesn't need the PTE.  For example, vc_slow_virt_to_phys() needs
> the PTE, and needs the size rather than the "level" per se.
> 
> So for now I punted, while keeping open the door for moving code from
> arch/x86/kvm/ to mm/ if anyone else (even other KVM ports) needs the same
> logic.
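
As an aside, the size-to-level trick mentioned above boils down to something
like the below.  This is only a sketch; the helper name is made up, and it
yields 0-based levels rather than the PG_LEVEL_* enum:

static inline int pgsize_to_level(unsigned long size)
{
	/*
	 * 4KiB (1 << 12) -> 0, 2MiB (1 << 21) -> 1, 1GiB (1 << 30) -> 2.
	 *
	 * Assumes 4KiB base pages and 9 bits of translation per level,
	 * which is precisely why it breaks on architectures with
	 * heterogeneous page table geometry.
	 */
	return (__builtin_ctzl(size) - 12) / 9;
}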

Ugh.  I was going to say that KVM is the only in-tree user that's subtly broken,
but then I saw vc_slow_virt_to_phys()...  So there's at least one other use case
for walking user addresses and being able to tolerate a not-present mapping.

There are no other users of lookup_address_in_mm(), and other than SEV-ES's
dastardly use of lookup_address_in_pgd(), the pgd can only ever come from
init_mm or efi_mm, i.e. can't work with user addresses anyway.

If we go the KVM-only route, can we send the below revert along with it?  The
least we can do is not give others an easy way to screw up.

Until I saw the #VC crud, I was hoping we could also explicitly prevent using
lookup_address_in_pgd() with user addresses.  If/when #VC is fixed, we can/should
add this:

@@ -592,6 +592,15 @@ pte_t *lookup_address_in_pgd(pgd_t *pgd, unsigned long address,

 	*level = PG_LEVEL_NONE;

+	/*
+	 * The below walk does not guard against user page tables being torn
+	 * down; attempting to walk a user address is dangerous and likely to
+	 * explode sooner or later.  This helper is intended only for use with
+	 * kernel-only mm_structs, e.g. init_mm and efi_mm.
+	 */
+	if (WARN_ON_ONCE(address < TASK_SIZE_MAX))
+		return NULL;
+
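
For reference, the IRQs-off + READ_ONCE() walk shape that the helper quoted
at the top implements looks roughly like the below.  This is just a sketch
with an illustrative name, not the actual patch:

static int sketch_host_mapping_level(struct mm_struct *mm, unsigned long hva)
{
	int level = PG_LEVEL_NONE;
	unsigned long flags;
	pgd_t pgd;
	p4d_t p4d;
	pud_t pud;
	pmd_t pmd;

	/*
	 * Disabling IRQs blocks the TLB shootdown IPIs that must complete
	 * before a user page table page can be freed, i.e. keeps the tables
	 * alive for the duration of the walk.
	 */
	local_irq_save(flags);

	/*
	 * READ_ONCE() snapshots each entry, so a concurrent conversion to a
	 * huge page can't be observed halfway between checking an entry and
	 * grabbing the address of the next level down.
	 */
	pgd = READ_ONCE(*pgd_offset(mm, hva));
	if (pgd_none(pgd))
		goto out;

	p4d = READ_ONCE(*p4d_offset(&pgd, hva));
	if (p4d_none(p4d) || !p4d_present(p4d))
		goto out;

	pud = READ_ONCE(*pud_offset(&p4d, hva));
	if (pud_none(pud) || !pud_present(pud))
		goto out;

	if (pud_large(pud)) {
		level = PG_LEVEL_1G;
		goto out;
	}

	pmd = READ_ONCE(*pmd_offset(&pud, hva));
	if (pmd_none(pmd) || !pmd_present(pmd))
		goto out;

	if (pmd_large(pmd)) {
		level = PG_LEVEL_2M;
		goto out;
	}

	level = PG_LEVEL_4K;
out:
	local_irq_restore(flags);
	return level;
}

Note that, unlike vc_slow_virt_to_phys(), this never reads the PTE itself,
which is what makes KVM the odd one out.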


From: Sean Christopherson <seanjc@xxxxxxxxxx>
Date: Fri, 29 Apr 2022 07:57:53 -0700
Subject: [PATCH] Revert "x86/mm: Introduce lookup_address_in_mm()"

Drop lookup_address_in_mm() now that KVM is providing its own variant
of lookup_address_in_pgd() that is safe for use with user addresses, e.g.
guards against page tables being torn down.  A variant that accepts a
non-init mm is inherently dangerous and flawed, as the only reason to use
an mm other than init_mm is to walk a userspace mapping, and
lookup_address_in_pgd() does not play nice with userspace mappings, e.g.
doesn't disable IRQs to block TLB shootdowns and doesn't use READ_ONCE()
to ensure an upper level entry isn't converted to a huge page between
checking the PAGE_SIZE bit and grabbing the address of the next level
down.

This reverts commit 13c72c060f1ba6f4eddd7b1c4f52a8aded43d6d9.

Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
---
 arch/x86/include/asm/pgtable_types.h |  4 ----
 arch/x86/mm/pat/set_memory.c         | 11 -----------
 2 files changed, 15 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 40497a9020c6..407084d9fd99 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -559,10 +559,6 @@ static inline void update_page_count(int level, unsigned long pages) { }
 extern pte_t *lookup_address(unsigned long address, unsigned int *level);
 extern pte_t *lookup_address_in_pgd(pgd_t *pgd, unsigned long address,
 				    unsigned int *level);
-
-struct mm_struct;
-extern pte_t *lookup_address_in_mm(struct mm_struct *mm, unsigned long address,
-				   unsigned int *level);
 extern pmd_t *lookup_pmd_address(unsigned long address);
 extern phys_addr_t slow_virt_to_phys(void *__address);
 extern int __init kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn,
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index abf5ed76e4b7..0656db33574d 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -638,17 +638,6 @@ pte_t *lookup_address(unsigned long address, unsigned int *level)
 }
 EXPORT_SYMBOL_GPL(lookup_address);

-/*
- * Lookup the page table entry for a virtual address in a given mm. Return a
- * pointer to the entry and the level of the mapping.
- */
-pte_t *lookup_address_in_mm(struct mm_struct *mm, unsigned long address,
-			    unsigned int *level)
-{
-	return lookup_address_in_pgd(pgd_offset(mm, address), address, level);
-}
-EXPORT_SYMBOL_GPL(lookup_address_in_mm);
-
 static pte_t *_lookup_address_cpa(struct cpa_data *cpa, unsigned long address,
 				  unsigned int *level)
 {

base-commit: 6f363ed2fa4c24c400acc29b659c96e4dc7930e8
--
