[REGRESSION]: pbuilder random crashes on 6.1.y x86 with ARM64 compiles

Hi,

there seems to be a subtle regression in the 6.1.y kernels. I have been getting random crashes with pbuilder
running on 64-bit x86 (Intel hardware, but it also happens inside VMs) since Debian stable moved to
6.1.115. At first glance, this looks like the usual GCC segfault caused by faulty hardware:

...
ENABLE_TRF_FOR_NS=0 -DENCRYPT_BL31=0 -DENCRYPT_BL32=0 -DERRATA_SPECULATIVE_AT=0 -DERROR
_DEPRECATED=0 -DFAULT_INJECTION_SUPPORT=0 -DGICV2_G0_FOR_EL3=1 -DHANDLE_EA_EL3_FIRST=0
-DHW_ASSISTED_COHERENCY=0 -DLOG_LEVEL=40 -DMEASURED_BOOT=0 -DNR_OF_FW_BANKS=2 -DNR_OF_I
MAGES_IN_FW_BANK=1 -DNS_TIMER_SWITCH=0 -DPL011_GENERIC_UART=0 -DPLAT_zynqmp -DPROGRAMMA
BLE_RESET_ADDRESS=1 -DPSA_FWU_SUPPORT=0 -DPSCI_EXTENDED_STATE_ID=1 -DRAS_EXTENSION=0 -D
RAS_TRAP_LOWER_EL_ERR_ACCESS=0 -DRECLAIM_INIT_CODE=0 -DRESET_TO_BL31=1 -DSDEI_IN_FCONF=
0 -DSEC_INT_DESC_IN_FCONF=0 -DSEPARATE_CODE_AND_RODATA=1 -DSEPARATE_NOBITS_REGION=0 -DS
PD_none -DSPIN_ON_BL1_EXIT=0 -DSPMD_SPM_AT_SEL2=1 -DSPM_MM=0 -DTRNG_SUPPORT=0 -DTRUSTED
_BOARD_BOOT=0 -DUSE_COHERENT_MEM=1 -DUSE_DEBUGFS=0 -DUSE_ROMLIB=0 -DUSE_SP804_TIMER=0 -
DUSE_SPINLOCK_CAS=0 -DUSE_TBBR_DEFS=1 -DWARMBOOT_ENABLE_DCACHE_EARLY=1 -Iinclude -Iincl
ude/arch/aarch64 -Iinclude/lib/cpus/aarch64 -Iinclude/lib/el3_runtime/aarch64 -Iinclude
/plat/arm/common/ -Iinclude/plat/arm/common/aarch64/ -Iplat/xilinx/common/include/ -Iplat/xilinx/common/ipi_mailbox_service/ -Iplat/xilinx/zynqmp/include/ -Iplat/xilinx/zynqmp/pm_service/   -Iinclude/lib/libfdt -Iinclude/lib/libc -Iinclude/lib/libc/aarch64   -nostdinc -Werror -Wall -Wmissing-include-dirs -Wunused -Wdisabled-optimization -Wvla -Wshadow -Wno-unused-parameter -Wredundant-decls -Wunused-but-set-variable -Wmaybe-uninitialized -Wpacked-bitfield-compat -Wshift-overflow=2 -Wlogical-op -Wno-error=deprecated-declarations -Wno-error=cpp -march=armv8-a -mgeneral-regs-only -mstrict-align -mbranch-protection=none -ffunction-sections -fdata-sections -ffreestanding -fno-builtin -fno-common -Os -std=gnu99 -fno-PIE -fno-stack-protector  -fno-jump-tables -DIMAGE_AT_EL3 -DIMAGE_BL31  -Wp,-MD,/build/arm-trusted-firmware-kk-2.6-2022-2-kk/build/zynqmp/release/bl31/plat_psci.d -MT /build/arm-trusted-firmware-kk-2.6-2022-2-kk/build/zynqmp/release/bl31/plat_psci.o -MP -c plat/xilinx/zynqmp/plat_psci.c -o /build/arm-trusted-firmware-kk-2.6-2022-2-kk/build/zynqmp/release/bl31/plat_psci.o
make[2]: *** [Makefile:1251: /build/arm-trusted-firmware-kk-2.6-2022-2-kk/build/zynqmp/release/bl31/plat_psci.o] Segmentation fault
make[2]: *** Waiting for unfinished jobs....
...

(That is a pbuilder build of the ARM Trusted Firmware, but sooner or later any other ARM64 application build
with pbuilder crashes as well. It does NOT happen on the first or second run, usually only after the third or fifth run.)

However, the crashes went away when I switched back to 6.1.112 (the previous Debian stable kernel).
I git-bisected it down to this commit:

b0cde867b80a5e81fcbc0383e138f5845f2005ee is the first bad commit
commit b0cde867b80a5e81fcbc0383e138f5845f2005ee
Author: Kees Cook <keescook@xxxxxxxxxxxx>
Date:   Fri Feb 16 22:25:43 2024 -0800

    x86: Increase brk randomness entropy for 64-bit systems

    [ Upstream commit 44c76825d6eefee9eb7ce06c38e1a6632ac7eb7d ]

    In commit c1d171a00294 ("x86: randomize brk"), arch_randomize_brk() was
    defined to use a 32MB range (13 bits of entropy), but was never increased
    when moving to 64-bit. The default arch_randomize_brk() uses 32MB for
    32-bit tasks, and 1GB (18 bits of entropy) for 64-bit tasks.

    Update x86_64 to match the entropy used by arm64 and other 64-bit
    architectures.

    Reported-by: y0un9n132@xxxxxxxxx
    Signed-off-by: Kees Cook <keescook@xxxxxxxxxxxx>
    Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
    Acked-by: Jiri Kosina <jkosina@xxxxxxxx>
    Closes: https://lore.kernel.org/linux-hardening/CA+2EKTVLvc8hDZc+2Yhwmus=dzOUG5E4gV7ayCbu0MPJTZzWkw@xxxxxxxxxxxxxx/
    Link: https://lore.kernel.org/r/20240217062545.1631668-1-keescook@xxxxxxxxxxxx
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

If I revert that commit, like:

-------------------------- arch/x86/kernel/process.c --------------------------
index acc83738bf5b..279b5e9be80f 100644
@@ -991,10 +991,7 @@ unsigned long arch_align_stack(unsigned long sp)

 unsigned long arch_randomize_brk(struct mm_struct *mm)
 {
-	if (mmap_is_ia32())
-		return randomize_page(mm->brk, SZ_32M);
-
-	return randomize_page(mm->brk, SZ_1G);
+	return randomize_page(mm->brk, 0x02000000);
 }

 /*

With that revert, I can run pbuilder ARM64 builds all day without a single crash. I have no idea why
that change broke pbuilder. Maybe it is related to the way qemu is used inside the ARM64 chroot
environment, but in my opinion it is a kernel regression.

TIA,
Uli

Kind regards

Dipl.-Inform. Ulrich Teichert
Senior Software Developer

kumkeo GmbH
Heidenkampsweg 82a
20097 Hamburg
Germany

T: +49 40 2846761-0
F: +49 40 2846761-99

ulrich.teichert@xxxxxxxxx
www.kumkeo.de

Hamburg District Court, HRB 108558
Managing Directors: Dipl.-Ing. Bernd Sager; Dipl.-Ing. Sven Tanneberger, MBA



