Hi,

On 21.12.2021 14:34, Ard Biesheuvel wrote:
> On Tue, 21 Dec 2021 at 12:15, Marek Szyprowski <m.szyprowski@xxxxxxxxxxx> wrote:
>> Hi Ard,
>>
>> On 21.12.2021 11:44, Ard Biesheuvel wrote:
>>> On Tue, 21 Dec 2021 at 11:39, Marek Szyprowski <m.szyprowski@xxxxxxxxxxx> wrote:
>>>> On 22.11.2021 10:28, Ard Biesheuvel wrote:
>>>>> Wire up the generic support for managing task stack allocations via vmalloc,
>>>>> and implement the entry code that detects whether we faulted because of a
>>>>> stack overrun (or future stack overrun caused by pushing the pt_regs array)
>>>>>
>>>>> While this adds a fair amount of tricky entry asm code, it should be
>>>>> noted that it only adds a TST + branch to the svc_entry path. The code
>>>>> implementing the non-trivial handling of the overflow stack is emitted
>>>>> out-of-line into the .text section.
>>>>>
>>>>> Since on ARM, we rely on do_translation_fault() to keep PMD level page
>>>>> table entries that cover the vmalloc region up to date, we need to
>>>>> ensure that we don't hit such a stale PMD entry when accessing the
>>>>> stack. So we do a dummy read from the new stack while still running from
>>>>> the old one on the context switch path, and bump the vmalloc_seq counter
>>>>> when PMD level entries in the vmalloc range are modified, so that the MM
>>>>> switch fetches the latest version of the entries.
>>>>>
>>>>> Note that we need to increase the per-mode stack by 1 word, to gain some
>>>>> space to stash a GPR until we know it is safe to touch the stack.
>>>>> However, due to the cacheline alignment of the struct, this does not
>>>>> actually increase the memory footprint of the struct stack array at all.
>>>>>
>>>>> Signed-off-by: Ard Biesheuvel <ardb@xxxxxxxxxx>
>>>>> Tested-by: Keith Packard <keithpac@xxxxxxxxxx>
>>>> This patch landed recently in linux-next 20211220 as commit a1c510d0adc6
>>>> ("ARM: implement support for vmap'ed stacks"). Sadly it breaks
>>>> suspend/resume operation on all ARM 32bit Exynos SoCs. Probably the
>>>> suspend/resume related code must be updated somehow (it partially works
>>>> on physical addresses and disabled MMU), but I didn't analyze it yet. If
>>>> you have any hints, let me know.
>>>>
>>> Are there any such systems in KernelCI? We caught a suspend/resume
>>> related issue in development, which is why the hunk below was added.
>>
>> I think that some Exynos-based Odroids (U3 and XU3) were some time ago
>> available in KernelCI, but I don't know if they are still there.
>>
>>
>>> In general, any virt-to-phys translation involving an address on the
>>> stack will become problematic.
>>>
>>> Could you please confirm whether the issue persists with the patch
>>> applied but with CONFIG_VMAP_STACK turned off? Just so we know we are
>>> looking in the right place?
>>
>> I've just checked. After disabling CONFIG_VMAP_STACK suspend/resume
>> works fine both on commit a1c510d0adc6 and linux-next 20211220.
>>
> Thanks. Any other context you can provide beyond 'does not work'?

Well, the board suspends properly, but it doesn't wake up afterwards
(tested remotely with the rtcwake command). So far I cannot provide
anything more.

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland