On Tue, 21 Dec 2021 at 11:39, Marek Szyprowski <m.szyprowski@xxxxxxxxxxx> wrote:
>
> Hi,
>
> On 22.11.2021 10:28, Ard Biesheuvel wrote:
> > Wire up the generic support for managing task stack allocations via vmalloc,
> > and implement the entry code that detects whether we faulted because of a
> > stack overrun (or future stack overrun caused by pushing the pt_regs array)
> >
> > While this adds a fair amount of tricky entry asm code, it should be
> > noted that it only adds a TST + branch to the svc_entry path. The code
> > implementing the non-trivial handling of the overflow stack is emitted
> > out-of-line into the .text section.
> >
> > Since on ARM, we rely on do_translation_fault() to keep PMD level page
> > table entries that cover the vmalloc region up to date, we need to
> > ensure that we don't hit such a stale PMD entry when accessing the
> > stack. So we do a dummy read from the new stack while still running from
> > the old one on the context switch path, and bump the vmalloc_seq counter
> > when PMD level entries in the vmalloc range are modified, so that the MM
> > switch fetches the latest version of the entries.
> >
> > Note that we need to increase the per-mode stack by 1 word, to gain some
> > space to stash a GPR until we know it is safe to touch the stack.
> > However, due to the cacheline alignment of the struct, this does not
> > actually increase the memory footprint of the struct stack array at all.
> >
> > Signed-off-by: Ard Biesheuvel <ardb@xxxxxxxxxx>
> > Tested-by: Keith Packard <keithpac@xxxxxxxxxx>
> >
> This patch landed recently in linux-next 20211220 as commit a1c510d0adc6
> ("ARM: implement support for vmap'ed stacks"). Sadly it breaks
> suspend/resume operation on all ARM 32bit Exynos SoCs. Probably the
> suspend/resume related code must be updated somehow (it partially works
> on physical addresses and disabled MMU), but I didn't analyze it yet. If
> you have any hints, let me know.
>

Are there any such systems in KernelCI?
We caught a suspend/resume related issue in development, which is why
the hunk below was added. In general, any virt-to-phys translation
involving an address on the stack will become problematic.

Could you please confirm whether the issue persists with the patch
applied but with CONFIG_VMAP_STACK turned off? Just so we know we are
looking in the right place.

> diff --git a/arch/arm/kernel/sleep.S b/arch/arm/kernel/sleep.S
> index 43077e11dafd..803b51e5cba0 100644
> --- a/arch/arm/kernel/sleep.S
> +++ b/arch/arm/kernel/sleep.S
> @@ -67,6 +67,14 @@ ENTRY(__cpu_suspend)
>         ldr     r4, =cpu_suspend_size
>  #endif
>         mov     r5, sp                  @ current virtual SP
> +#ifdef CONFIG_VMAP_STACK
> +       @ Run the suspend code from the overflow stack so we don't have to rely
> +       @ on vmalloc-to-phys conversions anywhere in the arch suspend code.
> +       @ The original SP value captured in R5 will be restored on the way out.
> +       mov_l   r6, overflow_stack_ptr  @ Base pointer
> +       mrc     p15, 0, r7, c13, c0, 4  @ Get per-CPU offset
> +       ldr     sp, [r6, r7]            @ Address of this CPU's overflow stack
> +#endif
>         add     r4, r4, #12             @ Space for pgd, virt sp, phys resume fn
>         sub     sp, sp, r4              @ allocate CPU state on stack
>         ldr     r3, =sleep_save_sp
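
To illustrate why a virt-to-phys translation of a stack address becomes
problematic once the stack lives in the vmalloc area, here is a minimal
user-space sketch in C. It is not kernel code: the layout constants and
the helper names (in_linear_map, linear_virt_to_phys, phys_in_ram) are
hypothetical and chosen only for illustration. It assumes the translation
is a plain offset subtraction, which is only meaningful for addresses in
the kernel's linear mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical memory layout, for illustration only. */
#define PAGE_OFFSET 0xC0000000u /* start of the kernel linear map */
#define PHYS_OFFSET 0x40000000u /* physical address where RAM starts */
#define LOWMEM_SIZE 0x30000000u /* RAM covered by the linear map */

/* True if vaddr lies inside the linear map. */
static bool in_linear_map(uint32_t vaddr)
{
	return vaddr >= PAGE_OFFSET && (vaddr - PAGE_OFFSET) < LOWMEM_SIZE;
}

/* True if paddr lies inside the RAM covered by the linear map. */
static bool phys_in_ram(uint32_t paddr)
{
	return paddr >= PHYS_OFFSET && (paddr - PHYS_OFFSET) < LOWMEM_SIZE;
}

/*
 * Offset-based translation. The arithmetic works for any input, but the
 * result is only a real physical address for linear map addresses.
 */
static uint32_t linear_virt_to_phys(uint32_t vaddr)
{
	return vaddr - PAGE_OFFSET + PHYS_OFFSET;
}

/* Same translation, but refuses addresses outside the linear map. */
static uint32_t checked_virt_to_phys(uint32_t vaddr)
{
	assert(in_linear_map(vaddr));
	return linear_virt_to_phys(vaddr);
}
```

With these made-up values, a linear-map stack pointer such as 0xC1234000
translates to a valid RAM address, whereas a vmalloc-range stack pointer
such as 0xF0804000 yields an address outside RAM when pushed through the
same arithmetic. Running the suspend path from a stack whose address
translates linearly avoids the need for such conversions on vmalloc
addresses.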