On Tue, 30 Jul 2024 at 11:14, <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
>
> This is a note to let you know that I've just added the patch titled
>
>     arm64: mm: Fix lockless walks with static and dynamic page-table folding
>
> to the 6.1-stable tree which can be found at:
>     http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
>
> The filename of the patch is:
>      arm64-mm-fix-lockless-walks-with-static-and-dynamic-page-table-folding.patch
> and it can be found in the queue-6.1 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable@xxxxxxxxxxxxxxx> know about it.
>

Please drop this from the v6.1 queue. Why is this being considered for
v6.1 in the first place? The Fixes: tag mentions a patch that was
introduced in v6.9.

>
> From 36639013b3462c06ff8e3400a427f775b4fc97f5 Mon Sep 17 00:00:00 2001
> From: Will Deacon <will@xxxxxxxxxx>
> Date: Thu, 25 Jul 2024 10:03:45 +0100
> Subject: arm64: mm: Fix lockless walks with static and dynamic page-table folding
>
> From: Will Deacon <will@xxxxxxxxxx>
>
> commit 36639013b3462c06ff8e3400a427f775b4fc97f5 upstream.
>
> Lina reports random oopsen originating from the fast GUP code when
> 16K pages are used with 4-level page-tables, the fourth level being
> folded at runtime due to lack of LPA2.
>
> In this configuration, the generic implementation of
> p4d_offset_lockless() will return a 'p4d_t *' corresponding to the
> 'pgd_t' allocated on the stack of the caller, gup_fast_pgd_range().
> This is normally fine, but when the fourth level of page-table is folded
> at runtime, pud_offset_lockless() will offset from the address of the
> 'p4d_t' to calculate the address of the PUD in the same page-table page.
> This results in a stray stack read when the 'p4d_t' has been allocated
> on the stack and can send the walker into the weeds.
>
> Fix the problem by providing our own definition of p4d_offset_lockless()
> when CONFIG_PGTABLE_LEVELS <= 4 which returns the real page-table
> pointer rather than the address of the local stack variable.
>
> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> Cc: Ard Biesheuvel <ardb@xxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> Link: https://lore.kernel.org/r/50360968-13fb-4e6f-8f52-1725b3177215@xxxxxxxxxxxxx
> Fixes: 0dd4f60a2c76 ("arm64: mm: Add support for folding PUDs at runtime")
> Reported-by: Asahi Lina <lina@xxxxxxxxxxxxx>
> Reviewed-by: Ard Biesheuvel <ardb@xxxxxxxxxx>
> Link: https://lore.kernel.org/r/20240725090345.28461-1-will@xxxxxxxxxx
> Signed-off-by: Will Deacon <will@xxxxxxxxxx>
> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> ---
>  arch/arm64/include/asm/pgtable.h |   22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
>
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -1065,6 +1065,28 @@ static inline bool pgtable_l5_enabled(vo
>
>  #define p4d_offset_kimg(dir,addr)	((p4d_t *)dir)
>
> +static inline
> +p4d_t *p4d_offset_lockless_folded(pgd_t *pgdp, pgd_t pgd, unsigned long addr)
> +{
> +	/*
> +	 * With runtime folding of the pud, pud_offset_lockless() passes
> +	 * the 'pgd_t *' we return here to p4d_to_folded_pud(), which
> +	 * will offset the pointer assuming that it points into
> +	 * a page-table page. However, the fast GUP path passes us a
> +	 * pgd_t allocated on the stack and so we must use the original
> +	 * pointer in 'pgdp' to construct the p4d pointer instead of
> +	 * using the generic p4d_offset_lockless() implementation.
> +	 *
> +	 * Note: reusing the original pointer means that we may
> +	 * dereference the same (live) page-table entry multiple times.
> +	 * This is safe because it is still only loaded once in the
> +	 * context of each level and the CPU guarantees same-address
> +	 * read-after-read ordering.
> +	 */
> +	return p4d_offset(pgdp, addr);
> +}
> +#define p4d_offset_lockless p4d_offset_lockless_folded
> +
>  #endif /* CONFIG_PGTABLE_LEVELS > 4 */
>
>  #define pgd_ERROR(e)	\
>
>
> Patches currently in stable-queue which might be from will@xxxxxxxxxx are
>
> queue-6.1/iommu-vt-d-fix-identity-map-bounds-in-si_domain_init.patch