On Tue, Sep 26, 2023 at 9:09 AM Masahiro Yamada <masahiroy@xxxxxxxxxx> wrote: > > The 32-bit ARM kernel stops working if the kernel grows to the point > where veneers for __get_user_* are created. > > AAPCS32 [1] states, "Register r12 (IP) may be used by a linker as a > scratch register between a routine and any subroutine it calls. It > can also be used within a routine to hold intermediate values between > subroutine calls." > > However, bl instructions buried within the inline asm are unpredictable > for compilers; hence, "ip" must be added to the clobber list. > > This becomes critical when veneers for __get_user_* are created because > veneers use the ip register since commit 02e541db0540 ("ARM: 8323/1: > force linker to use PIC veneers"). > > [1]: https://github.com/ARM-software/abi-aa/blob/2023Q1/aapcs32/aapcs32.rst > > Signed-off-by: Masahiro Yamada <masahiroy@xxxxxxxxxx> > Reviewed-by: Ard Biesheuvel <ardb@xxxxxxxxxx> + stable@xxxxxxxxxxxxxxx It seems like this (commit 24d3ba0a7b44c1617c27f5045eecc4f34752ab03 upstream) would be a good candidate for -stable? The issue it fixes can manifest in lots of very strange ways, so it would be good to avoid others getting tripped up by it on -stable branches. (Apologies for being a bit verbose in the following, I've included a lot of details and breadcrumbs so others might find this if they run into the same issues.) I was recently looking into an arm32 issue, and found getting a custom built kernel consistently working in qemu-system-arm to bisect issues in the range of 5.15-6.6 was a bit difficult, as I would hit a couple different odd errors. For 5.15 I was seeing systemd fail to start in a fairly opaque way: starting systemd-udevd.service - Rule-based Manager for Device Events and Files. systemd-udevd.service: Main process exited, code=exited, status=1/FAILURE systemd-udevd.service: Failed with result 'exit-code'. Failed to start systemd-udevd.service - Rule-based Manager for Device Events and Files. But further looking through the logs I found: systemd[1]: Failed to open netlink: Operation not permitted Despite lots of digging to try to understand what was going wrong, the one thing that worked was switching to CONFIG_CC_OPTIMIZE_FOR_SIZE (which I only tried as I came across this old thread: https://lists.yoctoproject.org/g/linux-yocto/message/8035 ), this seemed very suspicious, but I didn't have a lot of time to dig further. That resolved things until ~6.1, where I started seeing crashes at init: [ 16.982562] Run /init as init process [ 16.989311] Failed to execute /init (error -22) [ 16.990017] Run /sbin/init as init process [ 16.994737] Starting init: /sbin/init exists but couldn't execute it (error -22) That I bisected that failure down to being supposedly caused by commit 5750121ae738 ("kbuild: list sub-directories in ./Kbuild") https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5750121ae7382ebac8d47ce6d68012d6cd1d7926 And searching around that commit luckily led me to this change, which finally seems to resolve the different issues I saw for 6.6, 6.1 and 5.15! Now, In my rush to get something booting with qemu, I started with the debian config but disabled modules, and didn't put much time into getting rid of config options or drivers I wouldn't need. So the kernel is pretty large. So maybe not super common, but I definitely wouldn't want others to have to go down this debugging rabbit hole. thanks -john