On Mon, Mar 28, 2022 at 11:15 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Mon, Mar 28, 2022, Mingwei Zhang wrote:
> > With that, I start to feel this is a bug. The issue is just so rare
> > that it has never triggered a problem.
> >
> > lookup_address_in_mm() walks the host page table as if it is a
> > sequence of _static_ memory chunks. This is clearly dangerous.
>
> Yeah, it's broken.  The proper fix is to do something like what perf uses,
> or maybe just genericize and reuse the code from commit 8af26be06272
> ("perf/core: Fix arch_perf_get_page_size()").

Hmm, I am thinking about this. We clearly need an adaptor layer if we
choose to reuse that function, e.g., converting the returned page size
into a mapping level, and deciding whether or not to disable IRQs.
Alternatively, I am wondering if we could just modify
lookup_address_in_mm() to make it safe for a "lockless" walk?

On top of that, kvm_mmu_max_mapping_level() is used in two places:
1) the EPT violation path and 2) disabling dirty logging. The former
does not require disabling/enabling IRQs since it is already safe in
that context. So maybe add a parameter to this function and plumb it
through to host_pfn_mapping_level()?

> > But right now, kvm_mmu_max_mapping_level() is used in other places
> > as well: kvm_mmu_zap_collapsible_spte(), which does not satisfy the
> > strict requirement of walking the host page table.
>
> The host pfn size is used only as a heuristic, so false positives/negatives
> are ok; the only race that needs to be avoided is dereferencing freed page
> table memory.  lookup_address_in_pgd() is really broken because it doesn't
> even ensure a given PxE is READ_ONCE().  I suppose one could argue the
> caller is broken, but I doubt KVM is the only user that doesn't provide the
> necessary protections.

Right. Since lookup_address_in_pgd() is so broken, I am thinking about
just fixing it in place instead of switching to a different function.
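
To be concrete about what I mean by "fixing it in place": something
roughly like the untested sketch below, which snapshots each entry with
READ_ONCE() before dereferencing it, along the lines of the lockless
walk perf does. The function name lookup_level_lockless() and the
"return a PG_LEVEL_*" convention are just for illustration, not a real
patch; callers that can race with page table freeing would additionally
need local_irq_save()/restore() (or an equivalent) around the walk so
the tables can't be freed underneath us.

	/*
	 * Illustrative sketch only: read each page table entry once with
	 * READ_ONCE() into a local copy, then derive the next level from
	 * that copy so a concurrent teardown can't be dereferenced twice.
	 */
	static int lookup_level_lockless(struct mm_struct *mm, unsigned long addr)
	{
		pgd_t pgd;
		p4d_t p4d;
		pud_t pud;
		pmd_t pmd;

		pgd = READ_ONCE(*pgd_offset(mm, addr));
		if (pgd_none(pgd))
			return PG_LEVEL_NONE;

		p4d = READ_ONCE(*p4d_offset(&pgd, addr));
		if (p4d_none(p4d) || !p4d_present(p4d))
			return PG_LEVEL_NONE;

		pud = READ_ONCE(*pud_offset(&p4d, addr));
		if (pud_none(pud) || !pud_present(pud))
			return PG_LEVEL_NONE;
		if (pud_large(pud))
			return PG_LEVEL_1G;

		pmd = READ_ONCE(*pmd_offset(&pud, addr));
		if (pmd_none(pmd) || !pmd_present(pmd))
			return PG_LEVEL_NONE;
		if (pmd_large(pmd))
			return PG_LEVEL_2M;

		return PG_LEVEL_4K;
	}

Whether the IRQ disabling lives in the helper or in the caller is
exactly the adaptor-layer question above; plumbing a flag through
kvm_mmu_max_mapping_level() would let the EPT violation path skip it.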