Re: [PATCH v4 7/7] ARM: implement support for vmap'ed stacks

Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> · Tue, 28 Dec 2021 17:12:54 +0100

On Tue, Dec 28, 2021 at 3:39 PM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> On Thu, Dec 23, 2021 at 3:30 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> > On Tue, 21 Dec 2021 at 22:56, Marek Szyprowski <m.szyprowski@xxxxxxxxxxx> wrote:
> > > On 21.12.2021 17:20, Ard Biesheuvel wrote:
> > > > On Tue, 21 Dec 2021 at 14:51, Marek Szyprowski <m.szyprowski@xxxxxxxxxxx> wrote:
> > > >> On 21.12.2021 14:34, Ard Biesheuvel wrote:
> > > >>> On Tue, 21 Dec 2021 at 12:15, Marek Szyprowski <m.szyprowski@xxxxxxxxxxx> wrote:
> > > >>>> On 21.12.2021 11:44, Ard Biesheuvel wrote:
> > > >>>>> On Tue, 21 Dec 2021 at 11:39, Marek Szyprowski <m.szyprowski@xxxxxxxxxxx> wrote:
> > > >>>>>> On 22.11.2021 10:28, Ard Biesheuvel wrote:
> > > >>>>>>> Wire up the generic support for managing task stack allocations via vmalloc,
> > > >>>>>>> and implement the entry code that detects whether we faulted because of a
> > > >>>>>>> stack overrun (or future stack overrun caused by pushing the pt_regs array).
> > > >>>>>>>
> > > >>>>>>> While this adds a fair amount of tricky entry asm code, it should be
> > > >>>>>>> noted that it only adds a TST + branch to the svc_entry path. The code
> > > >>>>>>> implementing the non-trivial handling of the overflow stack is emitted
> > > >>>>>>> out-of-line into the .text section.
> > > >>>>>>>
> > > >>>>>>> Since on ARM, we rely on do_translation_fault() to keep PMD level page
> > > >>>>>>> table entries that cover the vmalloc region up to date, we need to
> > > >>>>>>> ensure that we don't hit such a stale PMD entry when accessing the
> > > >>>>>>> stack. So we do a dummy read from the new stack while still running from
> > > >>>>>>> the old one on the context switch path, and bump the vmalloc_seq counter
> > > >>>>>>> when PMD level entries in the vmalloc range are modified, so that the MM
> > > >>>>>>> switch fetches the latest version of the entries.
> > > >>>>>>>
> > > >>>>>>> Note that we need to increase the per-mode stack by 1 word, to gain some
> > > >>>>>>> space to stash a GPR until we know it is safe to touch the stack.
> > > >>>>>>> However, due to the cacheline alignment of the struct, this does not
> > > >>>>>>> actually increase the memory footprint of the struct stack array at all.
> > > >>>>>>>
> > > >>>>>>> Signed-off-by: Ard Biesheuvel <ardb@xxxxxxxxxx>
> > > >>>>>>> Tested-by: Keith Packard <keithpac@xxxxxxxxxx>
> > > >>>>>> This patch landed recently in linux-next 20211220 as commit a1c510d0adc6
> > > >>>>>> ("ARM: implement support for vmap'ed stacks"). Sadly it breaks
> > > >>>>>> suspend/resume operation on all 32-bit ARM Exynos SoCs. The
> > > >>>>>> suspend/resume code probably needs updating (it partly runs on
> > > >>>>>> physical addresses with the MMU disabled), but I haven't analyzed it
> > > >>>>>> yet. If you have any hints, let me know.
> > > >>>>>>
> > > >>>>> Are there any such systems in KernelCI? We caught a suspend/resume
> > > >>>>> related issue in development, which is why the hunk below was added.
> > > >>>> I think some Exynos-based Odroids (U3 and XU3) were available in
> > > >>>> KernelCI some time ago, but I don't know if they are still there.
> > > >>>>
> > > >>>>
> > > >>>>> In general, any virt-to-phys translation involving an address on the
> > > >>>>> stack will become problematic.
> > > >>>>>
> > > >>>>> Could you please confirm whether the issue persists with the patch
> > > >>>>> applied but with CONFIG_VMAP_STACK turned off? Just so we know we are
> > > >>>>> looking in the right place?
> > > >>>> I've just checked. After disabling CONFIG_VMAP_STACK suspend/resume
> > > >>>> works fine both on commit a1c510d0adc6 and linux-next 20211220.
> > > >>>>
> > > >>> Thanks. Any other context you can provide beyond 'does not work' ?
> > > >> Well, the board suspends properly, but then it doesn't wake up (tested
> > > >> remotely with the rtcwake command). So far I cannot provide anything more.
> > > >>
> > > > Thanks. Does the below help? Or otherwise, could you try doubling the
> > > > size of the overflow stack at arch/arm/include/asm/thread_info.h:34?
> > >
> > > I've tried both (but not at the same time) on the current linux-next and
> > > neither helped. This must be something else... :/
> > >
> >
> > Thanks.
> >
> > As I don't have access to this hardware, I am going to have to rely on
> > someone who does to debug this further. The only alternative is
> > marking CONFIG_VMAP_STACK broken on MACH_EXYNOS but that would be
> > unfortunate.
>
> Wish I had seen this thread before...
>
> I've just bisected a resume after s2ram failure on R-Car Gen2 to the same
> commit a1c510d0adc604bb ("ARM: implement support for vmap'ed stacks")
> in arm/for-next.
>
> Expected output:
>
>     PM: suspend entry (deep)
>     Filesystems sync: 0.000 seconds
>     Freezing user space processes ... (elapsed 0.010 seconds) done.
>     OOM killer disabled.
>     Freezing remaining freezable tasks ... (elapsed 0.009 seconds) done.
>     Disabling non-boot CPUs ...
>
> [system suspended, this is also where it hangs on failure]
>
>     Enabling non-boot CPUs ...
>     CPU1 is up
>     sh-eth ee700000.ethernet eth0: Link is Down
>     Micrel KSZ8041RNLI ee700000.ethernet-ffffffff:01: attached PHY driver (mii_bus:phy_addr=ee700000.ethernet-ffffffff:01, irq=193)
>     OOM killer enabled.
>     Restarting tasks ... done.
>     PM: suspend exit
>
> Both wake-on-LAN and wake-up by gpio-keys fail.
> Nothing interesting in the kernel log, cf. above.
>
> Disabling CONFIG_VMAP_STACK fixes the issue for me.

Enabling CONFIG_ARM_LPAE also fixes the issue, but is not an option
for shmobile_defconfig, as that would break systems with a Cortex-A9.

> Just like arch/arm/mach-exynos/ (and others), arch/arm/mach-shmobile/
> has several *.S files related to secondary CPU bringup.
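
To make Ard's earlier point about virt-to-phys translation of stack
addresses concrete, here is a minimal user-space sketch. It is not kernel
code and not a diagnosis of this failure: the PAGE_OFFSET, PHYS_OFFSET and
LOWMEM_SIZE values, the example addresses and the function names are all
hypothetical, chosen only for illustration. It shows that the offset-based
linear-map translation is meaningful for a lowmem address but not for an
address in the vmalloc range, which is where task stacks live with
CONFIG_VMAP_STACK=y.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-bit layout, for illustration only (platform-specific). */
#define PAGE_OFFSET  0xC0000000u                  /* start of linear map (assumed) */
#define PHYS_OFFSET  0x40000000u                  /* physical start of RAM (assumed) */
#define LOWMEM_SIZE  0x30000000u                  /* 768 MiB of lowmem (assumed) */
#define HIGH_MEMORY  (PAGE_OFFSET + LOWMEM_SIZE)  /* end of linear map: 0xF0000000 */

/* True if vaddr lies inside the linear (lowmem) mapping. */
bool in_linear_map(uint32_t vaddr)
{
	return vaddr >= PAGE_OFFSET && vaddr < HIGH_MEMORY;
}

/*
 * Offset-based translation. It is only meaningful inside the linear map,
 * so it refuses anything else instead of returning a bogus address.
 */
bool linear_virt_to_phys(uint32_t vaddr, uint32_t *paddr)
{
	if (!in_linear_map(vaddr))
		return false;
	*paddr = vaddr - PAGE_OFFSET + PHYS_OFFSET;
	return true;
}

/* Sanity checks for the assumed layout above. */
void self_check(void)
{
	uint32_t pa = 0;

	/* A stack in lowmem translates with plain arithmetic. */
	assert(linear_virt_to_phys(0xC1234000u, &pa));
	assert(pa == 0x41234000u);

	/* A vmap'ed stack (made-up vmalloc address) has no such translation. */
	assert(!linear_virt_to_phys(0xF0804000u, &pa));
}
```

Low-level code that converts a stack address to a physical one with plain
arithmetic, and then uses it with the MMU off, would be a candidate to look
at under these assumptions.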

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds