On Fri, Apr 29, 2022, Paolo Bonzini wrote:
> On 4/29/22 16:35, Sean Christopherson wrote:
> > On Fri, Apr 29, 2022, Paolo Bonzini wrote:
> > > > +out:
> > > > +	local_irq_restore(flags);
> > > > +	return level;
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(kvm_lookup_address_level_in_mm);
> > >
> > > Exporting is not needed.
> > >
> > > Thanks for writing the walk code though.  I'll adapt it and integrate the
> > > patch.
> >
> > But why are we fixing this only in KVM?  I liked the idea of stealing perf's
> > implementation because it was a seemingly perfect fit and wouldn't introduce
> > new code (ignoring wrappers, etc...).
> >
> > We _know_ that at least one subsystem is misusing lookup_address_in_pgd() and
> > given that its wrappers are exported, I highly doubt KVM is the only offender.
> > It really feels like we're passing the buck here by burying the fix in KVM.
>
> There are two ways to do it:
>
> * having a generic function in mm/.  The main issue there is the lack of a
>   PG_LEVEL_{P4D,PUD,PMD,PTE} enum at the mm/ level.  We could use
>   (ctz(x) - 12) / 9 to go from size to level, but it's ugly and there could
>   be architectures with heterogeneous page table sizes.
>
> * having a generic function in arch/x86/.  In this case KVM seems to be the
>   odd one that doesn't need the PTE.  For example vc_slow_virt_to_phys needs
>   the PTE, and needs the size rather than the "level" per se.
>
> So for now I punted, while keeping open the door for moving code from
> arch/x86/kvm/ to mm/ if anyone else (even other KVM ports) needs the same
> logic.

Ugh.  I was going to say that KVM is the only in-tree user that's subtly
broken, but then I saw vc_slow_virt_to_phys()...  So there's at least one
other use case for walking user addresses and being able to tolerate a
not-present mapping.

There are no other users of lookup_address_in_mm(), and other than SEV-ES's
dastardly use of lookup_address_in_pgd(), pgd can only ever come from init_mm
or efi_mm, i.e. can't work with user addresses anyways.

If we go the KVM-only route, can we send the below revert along with it?  The
least we can do is not give others an easy way to screw up.

Until I saw the #VC crud, I was hoping we could also explicitly prevent using
lookup_address_in_pgd() with user addresses.  If/when #VC is fixed, we
can/should add this:

@@ -592,6 +592,15 @@ pte_t *lookup_address_in_pgd(pgd_t *pgd, unsigned long address,

 	*level = PG_LEVEL_NONE;

+	/*
+	 * The below walk does not guard against user page tables being torn
+	 * down; attempting to walk a user address is dangerous and likely to
+	 * explode sooner or later.  This helper is intended only for use with
+	 * kernel-only mm_structs, e.g. init_mm and efi_mm.
+	 */
+	if (WARN_ON_ONCE(address < TASK_SIZE_MAX))
+		return NULL;
+
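For reference, the pattern a tear-down-safe walk needs looks roughly like the
below.  This is only a sketch (lookup_level_in_mm() is a made-up name, and it
leans on x86's PG_LEVEL_* enum and p*d helpers), not the code Paolo is merging:

static int lookup_level_in_mm(struct mm_struct *mm, unsigned long addr)
{
	int level = PG_LEVEL_NONE;
	unsigned long flags;
	pgd_t pgd;
	p4d_t p4d;
	pud_t pud;
	pmd_t pmd;
	pte_t pte;

	/*
	 * Disabling IRQs blocks the TLB shootdown IPI that must complete
	 * before the user page tables can be freed, i.e. keeps the tables
	 * from being torn down in the middle of the walk.
	 */
	local_irq_save(flags);

	/*
	 * Read each entry exactly once via READ_ONCE().  Re-reading an entry
	 * could observe a non-leaf value on the first read and a huge page
	 * on the second, and send the walk into the weeds.
	 */
	pgd = READ_ONCE(*pgd_offset(mm, addr));
	if (pgd_none(pgd))
		goto out;

	p4d = READ_ONCE(*p4d_offset(&pgd, addr));
	if (p4d_none(p4d) || !p4d_present(p4d))
		goto out;

	pud = READ_ONCE(*pud_offset(&p4d, addr));
	if (pud_none(pud) || !pud_present(pud))
		goto out;
	if (pud_large(pud)) {
		level = PG_LEVEL_1G;
		goto out;
	}

	pmd = READ_ONCE(*pmd_offset(&pud, addr));
	if (pmd_none(pmd) || !pmd_present(pmd))
		goto out;
	if (pmd_large(pmd)) {
		level = PG_LEVEL_2M;
		goto out;
	}

	/* x86-64 only: user PTE pages are reachable via the direct map. */
	pte = READ_ONCE(*pte_offset_kernel(&pmd, addr));
	if (pte_present(pte))
		level = PG_LEVEL_4K;
out:
	local_irq_restore(flags);
	return level;
}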
From: Sean Christopherson <seanjc@xxxxxxxxxx>
Date: Fri, 29 Apr 2022 07:57:53 -0700
Subject: [PATCH] Revert "x86/mm: Introduce lookup_address_in_mm()"

Drop lookup_address_in_mm() now that KVM is providing its own variant of
lookup_address_in_pgd() that is safe for use with user addresses, e.g.
guards against page tables being torn down.  A variant that takes a
non-init mm is inherently dangerous and flawed, as the only reason to use
an mm other than init_mm is to walk a userspace mapping, and
lookup_address_in_pgd() does not play nice with userspace mappings, e.g.
doesn't disable IRQs to block TLB shootdowns and doesn't use READ_ONCE()
to ensure an upper level entry isn't converted to a huge page between
checking the PAGE_SIZE bit and grabbing the address of the next level
down.

This reverts commit 13c72c060f1ba6f4eddd7b1c4f52a8aded43d6d9.

Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
---
 arch/x86/include/asm/pgtable_types.h |  4 ----
 arch/x86/mm/pat/set_memory.c         | 11 -----------
 2 files changed, 15 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 40497a9020c6..407084d9fd99 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -559,10 +559,6 @@ static inline void update_page_count(int level, unsigned long pages) { }
 extern pte_t *lookup_address(unsigned long address, unsigned int *level);
 extern pte_t *lookup_address_in_pgd(pgd_t *pgd, unsigned long address,
 				    unsigned int *level);
-
-struct mm_struct;
-extern pte_t *lookup_address_in_mm(struct mm_struct *mm, unsigned long address,
-				   unsigned int *level);
 extern pmd_t *lookup_pmd_address(unsigned long address);
 extern phys_addr_t slow_virt_to_phys(void *__address);
 extern int __init kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn,
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index abf5ed76e4b7..0656db33574d 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -638,17 +638,6 @@ pte_t *lookup_address(unsigned long address, unsigned int *level)
 }
 EXPORT_SYMBOL_GPL(lookup_address);

-/*
- * Lookup the page table entry for a virtual address in a given mm.  Return a
- * pointer to the entry and the level of the mapping.
- */
-pte_t *lookup_address_in_mm(struct mm_struct *mm, unsigned long address,
-			    unsigned int *level)
-{
-	return lookup_address_in_pgd(pgd_offset(mm, address), address, level);
-}
-EXPORT_SYMBOL_GPL(lookup_address_in_mm);
-
 static pte_t *_lookup_address_cpa(struct cpa_data *cpa, unsigned long address,
 				  unsigned int *level)
 {

base-commit: 6f363ed2fa4c24c400acc29b659c96e4dc7930e8
--
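P.S. On the (ctz(x) - 12) / 9 conversion mentioned above: the arithmetic does
check out for the standard x86 page sizes, it's just zero-based, i.e. off by
one from the PG_LEVEL_* enum (PG_LEVEL_4K == 1).  A quick userspace sketch
(size_to_level() is a made-up name):

#include <stdio.h>

/* Hypothetical helper: page size -> zero-based mapping level. */
static int size_to_level(unsigned long size)
{
	return (__builtin_ctzl(size) - 12) / 9;
}

int main(void)
{
	printf("%d\n", size_to_level(1UL << 12));	/* 0: 4KiB */
	printf("%d\n", size_to_level(1UL << 21));	/* 1: 2MiB */
	printf("%d\n", size_to_level(1UL << 30));	/* 2: 1GiB */
	return 0;
}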