Re: [PATCH v2 12/14] mm: Generalize arch_sync_kernel_mappings()

Catalin Marinas <catalin.marinas@xxxxxxx> · Tue, 25 Feb 2025 17:52:25 +0000

On Tue, Feb 25, 2025 at 05:10:10PM +0000, Ryan Roberts wrote:
> On 17/02/2025 14:08, Ryan Roberts wrote:
> > arch_sync_kernel_mappings() is an optional hook for arches to allow them
> > to synchronize certain levels of the kernel pgtables after modification.
> > But arm64 could benefit from a hook similar to this, paired with a call
> > prior to starting the batch of modifications.
> > 
> > So let's introduce arch_update_kernel_mappings_begin() and
> > arch_update_kernel_mappings_end(). Both have a default implementation
> > which can be overridden by the arch code. The default for the former is
> > a nop, and the default for the latter is to call
> > arch_sync_kernel_mappings(), so the latter replaces previous
> > arch_sync_kernel_mappings() callsites. So by default, the resulting
> > behaviour is unchanged.
> 
> Thanks to Kevin Brodsky; after some discussion we realised that while this works
> on arm64 today, it isn't really robust in general.
[...]
> As an alternative, I'm proposing to remove this change (keeping
> arch_sync_kernel_mappings() as it was), and instead start wrapping the vmap pte
> table walker functions with
> arch_enter_lazy_mmu_mode()/arch_exit_lazy_mmu_mode().

I came to the same conclusion while looking at the last three patches. I'm
also not a fan of relying on a TIF flag for batching.

> These have a smaller scope
> so there is no risk of the nesting (pgtable allocations happen outside the
> scope). arm64 will then use these lazy mmu hooks for its purpose of deferring
> barriers. There might be a small amount of performance loss due to the reduced
> scope, but I'm guessing most of the performance is in batching the operations of
> a single pte table.
> 
> One wrinkle is that arm64 needs to know if we are operating on kernel or user
> mappings in lazy mode. The lazy_mmu hooks apply to both kernel and user
> mappings, unlike my previous method which was kernel only. So I'm proposing to
> pass mm to arch_enter_lazy_mmu_mode().

Note that we have the efi_mm that uses PAGE_KERNEL prot bits while your
code only checks for init_mm after patch 13.
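
To illustrate the point, here is a rough userspace model of what I have
in mind, not kernel code. It assumes arch_enter_lazy_mmu_mode() takes
the mm as proposed above. mm_has_kernel_mappings() and
model_set_kernel_pte() are made-up names for this sketch, the
struct mm_struct is a stand-in, and the counter merely stands in for
whatever barriers arm64 would actually issue.

```c
/* Userspace model only: none of this is the real kernel implementation. */
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's struct mm_struct. */
struct mm_struct {
	const char *name;
};

static struct mm_struct init_mm = { "init_mm" };
static struct mm_struct efi_mm = { "efi_mm" };

/*
 * Hypothetical helper: treat both init_mm and efi_mm as holding
 * kernel (PAGE_KERNEL) mappings, rather than checking init_mm alone.
 */
static bool mm_has_kernel_mappings(const struct mm_struct *mm)
{
	return mm == &init_mm || mm == &efi_mm;
}

static bool lazy_kernel;		/* lazy mode active on a kernel mm */
static bool barrier_pending;		/* a deferred barrier is owed */
static unsigned int barriers_issued;	/* stand-in for real barriers */

static void issue_barriers(void)
{
	barriers_issued++;
}

/* Assumed shape of the hook: it is told which mm is being modified. */
static void arch_enter_lazy_mmu_mode(const struct mm_struct *mm)
{
	assert(!lazy_kernel);	/* the model assumes lazy mode never nests */
	lazy_kernel = mm_has_kernel_mappings(mm);
}

/* Leaving lazy mode emits at most one barrier for the whole batch. */
static void arch_exit_lazy_mmu_mode(void)
{
	if (barrier_pending)
		issue_barriers();
	barrier_pending = false;
	lazy_kernel = false;
}

/* Models writing one pte with kernel prot bits. */
static void model_set_kernel_pte(void)
{
	if (lazy_kernel)
		barrier_pending = true;	/* defer until exit */
	else
		issue_barriers();	/* not batching: barrier right away */
}
```

In this model, a batch of model_set_kernel_pte() calls made between
arch_enter_lazy_mmu_mode(&efi_mm) and arch_exit_lazy_mmu_mode() ends up
with a single deferred barrier, the same as for init_mm. With an
init_mm-only check, the efi_mm case would take the non-batched path.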

-- 
Catalin