On Wed, 11 Jan 2023 at 13:48, Arnd Bergmann <arnd@xxxxxxxx> wrote:
>
> On Wed, Jan 11, 2023, at 07:16, Naresh Kamboju wrote:
> > On Tue, 10 Jan 2023 at 23:36, Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> >>
> >
> > Results from Linaro’s test farm.
> > Regressions on arm64 Raspberry Pi 4 Model B.
> >
> > Reported-by: Linux Kernel Functional Testing <lkft@xxxxxxxxxx>
> >
> > While running LTP controllers cgroup_fj_stress_blkio test cases
> > the Insufficient stack space to handle exception! occurred and
> > followed by kernel panic on arm64 Raspberry Pi 4 Model B with
> > clang-15 built kernel Image.
> >
> > The full boot and test log attached to this email and build and
> > Kconfig links provided in the bottom of this email.
> >
> > I will try to reproduce this reported issue and get back to you.
>
> I looked at the log between 6.0.18 and 6.0.19-rc1, but don't see
> any arm64 or memory management patches that could result in this.
> Do you know if 6.0.18 ran successful

Yes, the test ran successfully on 6.0.18.
The same kernel, 6.0.19-rc1, built with gcc-12 did not hit this panic.
The reported issue is specific to the clang-15 build.

> > [ 2893.044339] Insufficient stack space to handle exception!
> > [ 2893.044351] ESR: 0x0000000096000047 -- DABT (current EL)
> > [ 2893.044360] FAR: 0xffff8000128180d0
> > [ 2893.044364] Task stack: [0xffff800012a18000..0xffff800012a1c000]
> > [ 2893.044370] IRQ stack: [0xffff80000a798000..0xffff80000a79c000]
> > [ 2893.044375] Overflow stack: [0xffff0000f77c4310..0xffff0000f77c5310]
> ...
> > [ 2893.044413] pc : el1h_64_sync+0x0/0x68
> > [ 2893.044430] lr : wp_page_copy+0xf8/0x90c
> > [ 2893.044445] sp : ffff8000128180d0
> ...
> > [ 2893.044692] el1h_64_sync+0x0/0x68
> > [ 2893.044700] do_wp_page+0x4a0/0x5c8
> > [ 2893.044708] handle_mm_fault+0x7fc/0x14dc
> > [ 2893.044718] do_page_fault+0x29c/0x450
> > [ 2893.044727] do_mem_abort+0x4c/0xf8
> > [ 2893.044741] el0_da+0x48/0xa8
> > [ 2893.044750] el0t_64_sync_handler+0xcc/0xf0
> > [ 2893.044759] el0t_64_sync+0x18c/0x190
>
> It claims that the stack overflow happened in do_wp_page(),
> but that has a really short call chain. It would be good
> to have the source line for do_wp_page+0x4a0/0x5c8 and
> wp_page_copy+0xf8/0x90c to see where exactly it was.
>
>
> > [ 2893.285975] WARNING: CPU: 2 PID: 315758 at kernel/sched/core.c:3119
> > set_task_cpu+0x14c/0x208
> ....
> > [ 2893.286117] CPU: 2 PID: 315758 Comm: cgroup_fj_stres Not tainted
> > [ 2893.286416] arch_timer_handler_phys+0x44/0x54
> > [ 2893.286427] handle_percpu_devid_irq+0x90/0x220
> > [ 2893.286439] generic_handle_domain_irq+0x38/0x50
> > [ 2893.286447] gic_handle_irq+0x68/0xe8
> > [ 2893.286455] el1_interrupt+0x88/0xc8
> > [ 2893.286464] el1h_64_irq_handler+0x18/0x24
> > [ 2893.286474] el1h_64_irq+0x64/0x68
> > [ 2893.286482] panic+0x2d8/0x374
>
> This is apparently a second unrelated bug -- it still processes timer
> interrupts after calling panic() and this apparently fails because
> the system is already unusable.
>
> > artifact-location:
> > https://storage.tuxsuite.com/public/linaro/lkft/builds/2K9JDtix2mHMoYRjNkBef3oR5JT
>

Adding a trailing "/" to the URL works:
https://storage.tuxsuite.com/public/linaro/lkft/builds/2K9JDtix2mHMoYRjNkBef3oR5JT/

> file not found. I tried to get the vmlinux file to look at the disassembly
> but the artifacts appear to be gone already.

System.map:
https://storage.tuxsuite.com/public/linaro/lkft/builds/2K9JDtix2mHMoYRjNkBef3oR5JT/System.map

vmlinux:
https://storage.tuxsuite.com/public/linaro/lkft/builds/2K9JDtix2mHMoYRjNkBef3oR5JT/vmlinux.xz

Sorry for the trouble.

- Naresh

>
> Arnd
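
For anyone re-checking the numbers in the overflow report quoted above, here is a minimal sketch that compares the reported sp against the three stack ranges and splits the ESR value into its fields. The addresses and the ESR are copied from the log; the helper names (locate_sp, decode_esr) are made up for illustration and are not kernel functions, and the bit positions assume the usual ESR_ELx layout for data aborts (EC in bits 31:26, IL in bit 25, WnR in bit 6, DFSC in bits 5:0).

```python
# Illustrative helpers only; the names are hypothetical, not kernel APIs.
# All constants below are copied from the quoted panic log.
STACKS = {
    "task": (0xffff800012a18000, 0xffff800012a1c000),
    "irq": (0xffff80000a798000, 0xffff80000a79c000),
    "overflow": (0xffff0000f77c4310, 0xffff0000f77c5310),
}
SP = 0xffff8000128180d0   # "sp" line, same value as FAR
ESR = 0x96000047          # "ESR" line


def locate_sp(sp, stacks):
    """Return the name of the stack range containing sp, or None."""
    for name, (low, high) in stacks.items():
        if low <= sp < high:
            return name
    return None


def decode_esr(esr):
    """Split an ESR value into the fields relevant to a data abort.

    Assumes the standard ESR_ELx layout for data aborts.
    """
    return {
        "ec": (esr >> 26) & 0x3f,   # exception class
        "il": (esr >> 25) & 0x1,    # instruction length bit
        "wnr": (esr >> 6) & 0x1,    # 1 = write, 0 = read
        "dfsc": esr & 0x3f,         # data fault status code
    }


if __name__ == "__main__":
    print("sp is on stack:", locate_sp(SP, STACKS))
    task_low, _ = STACKS["task"]
    print("distance below task stack base:", hex(task_low - SP))
    print("ESR fields:", {k: hex(v) for k, v in decode_esr(ESR).items()})
```

With the values from the log, sp falls in none of the three ranges and sits 0x1fff30 bytes below the task stack base. EC 0x25 is a data abort taken at the current EL, which matches the "DABT (current EL)" annotation, and WnR = 1 with DFSC 0x07 reads as a write that hit a level 3 translation fault. None of this says where the stack pointer went wrong; it only confirms the arithmetic behind the report.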