On Tue, May 31, 2022 at 10:36 AM Yegor Yefremov <yegorslists@xxxxxxxxxxxxxx> wrote:
>
> On Mon, May 30, 2022 at 5:15 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> >
> > On Mon, 30 May 2022 at 15:54, Arnd Bergmann <arnd@xxxxxxxx> wrote:
> > >
> > > On Sat, May 28, 2022 at 9:28 PM Yegor Yefremov
> > > <yegorslists@xxxxxxxxxxxxxx> wrote:
> > > >
> > > > On Sat, May 28, 2022 at 3:14 PM Arnd Bergmann <arnd@xxxxxxxx> wrote:
> > > > >
> > > > > On Sat, May 28, 2022 at 3:01 PM Yegor Yefremov
> > > > > <yegorslists@xxxxxxxxxxxxxx> wrote:
> > > > > > On Sat, May 28, 2022 at 11:07 AM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> > > > > > In file included from ./include/linux/irqflags.h:17,
> > > > > >                  from ./arch/arm/include/asm/bitops.h:28,
> > > > > >                  from ./include/linux/bitops.h:33,
> > > > > >                  from ./include/linux/log2.h:12,
> > > > > >                  from kernel/bounds.c:13:
> > > > > > ./arch/arm/include/asm/percpu.h: In function ‘__my_cpu_offset’:
> > > > > > ./arch/arm/include/asm/percpu.h:32:9: error: ‘__per_cpu_offset’
> > > > > > undeclared (first use in this function); did you mean
> > > > > > ‘__my_cpu_offset’?
> > > > > >    32 |         return __per_cpu_offset[0];
> > > > > >       |                ^~~~~~~~~~~~~~~~
> > > > > >       |                __my_cpu_offset
> > > > > > ./arch/arm/include/asm/percpu.h:32:9: note: each undeclared identifier
> > > > > > is reported only once for each function it appears in
> > > > >
> > > > > I think you just missed the line in my patch that adds the
> > > > > "extern unsigned long __per_cpu_offset[];" variable declaration.
> > > >
> > > > So, I tried both variants and both led to stalls.
> > >
> > > I'm running out of ideas here. Going back to the original bisection,
> > > I rebased Ard's patches in a way that you should be able to build the
> > > config for each patch, and I split up the "ARM: implement
> > > THREAD_INFO_IN_TASK for uniprocessor systems" commit in yet
> > > another way, hoping to get something left over that points to the
> > > bug. Can you try bisecting through the top commits of
> > >
> > > https://kernel.org/pub/scm/linux/kernel/git/soc/soc.git am335x-stall-test
> > >
> > > starting maybe with "52d240871760 irqchip: nvic: Use
> > > GENERIC_IRQ_MULTI_HANDLER" as the patch that is almost certainly
> > > going to be ok?
> > >
> > > At some point I fear we may have to give up and just mark the v6+SMP
> > > configuration as broken, which is something we have considered in the
> > > past but ended up always keeping around for the purpose of testing
> > > omap2plus_defconfig and imx_v6_v7_defconfig. Note that on production
> > > systems you probably don't want to use that config anyway, and should
> > > either stick to a uniprocessor build, or disable the ARMv6 support.
> > >
> >
> > Yeah, I am also running out of ideas. One question, though: does the
> > RCU detected stall always occur in the same place? I.e., how similar
> > are the backtraces of the stalls between different occurrences?
> > Perhaps we could narrow down where in the code we are stalling, and
> > gain some more understanding of the root cause.
>
> I have attached 4 crash logs and will start with Arnd's branch bisecting.

My bisect results:

git bisect log
git bisect start
# good: [52d24087176055d5994ac98378426421b2d6d653] irqchip: nvic: Use GENERIC_IRQ_MULTI_HANDLER
git bisect good 52d24087176055d5994ac98378426421b2d6d653
# bad: [2d3456213319c0277ee6082946c43c3afacca9b4] [PART 2] ARM: implement THREAD_INFO_IN_TASK for uniprocessor system
git bisect bad 2d3456213319c0277ee6082946c43c3afacca9b4
# good: [20e50fc1187d82d6d9ef80c01cf8e11d476f6227] ARM: 9176/1: avoid literal references in inline assembly
git bisect good 20e50fc1187d82d6d9ef80c01cf8e11d476f6227
# good: [59f3cd822afe6445b2864d0cf1a73ca6edd24f42] ARM: smp: defer TPIDRURO update for SMP v6 configurations too
git bisect good 59f3cd822afe6445b2864d0cf1a73ca6edd24f42
# bad: [b6b3b4814e77d2f5a7517297e9ac1d1aa1cda103] [PART 1] ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems
git bisect bad b6b3b4814e77d2f5a7517297e9ac1d1aa1cda103
# good: [dccfc18999cf4b4e518f01d5c7c578426166e5f2] ARM: v7m: enable support for IRQ stacks
git bisect good dccfc18999cf4b4e518f01d5c7c578426166e5f2
# first bad commit: [b6b3b4814e77d2f5a7517297e9ac1d1aa1cda103] [PART 1] ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems

Though commit b6b3b4814e77d2f5a7517297e9ac1d1aa1cda103 led to a broken
kernel that didn't even show any output after the bootloader had
started it. Commit 2d3456213319c0277ee6082946c43c3afacca9b4 showed the
expected stalling.

Yegor
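
For anyone who wants to double-check the bisection, the log above can be summarized mechanically. The following is a minimal Python sketch and not something from the thread itself: `parse_bisect_log` and `BISECT_LOG` are hypothetical names chosen for illustration, and the sketch assumes the log uses exactly the comment format shown above ("# good: [<sha>] <subject>").

```python
import re

# Log text copied from the message above (output of `git bisect log`).
BISECT_LOG = """\
git bisect start
# good: [52d24087176055d5994ac98378426421b2d6d653] irqchip: nvic: Use GENERIC_IRQ_MULTI_HANDLER
git bisect good 52d24087176055d5994ac98378426421b2d6d653
# bad: [2d3456213319c0277ee6082946c43c3afacca9b4] [PART 2] ARM: implement THREAD_INFO_IN_TASK for uniprocessor system
git bisect bad 2d3456213319c0277ee6082946c43c3afacca9b4
# good: [20e50fc1187d82d6d9ef80c01cf8e11d476f6227] ARM: 9176/1: avoid literal references in inline assembly
git bisect good 20e50fc1187d82d6d9ef80c01cf8e11d476f6227
# good: [59f3cd822afe6445b2864d0cf1a73ca6edd24f42] ARM: smp: defer TPIDRURO update for SMP v6 configurations too
git bisect good 59f3cd822afe6445b2864d0cf1a73ca6edd24f42
# bad: [b6b3b4814e77d2f5a7517297e9ac1d1aa1cda103] [PART 1] ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems
git bisect bad b6b3b4814e77d2f5a7517297e9ac1d1aa1cda103
# good: [dccfc18999cf4b4e518f01d5c7c578426166e5f2] ARM: v7m: enable support for IRQ stacks
git bisect good dccfc18999cf4b4e518f01d5c7c578426166e5f2
# first bad commit: [b6b3b4814e77d2f5a7517297e9ac1d1aa1cda103] [PART 1] ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems
"""

# Matches comment lines such as "# good: [<40-hex sha>] <subject>".
COMMENT_RE = re.compile(
    r"^# (good|bad|first bad commit): \[([0-9a-f]{40})\] (.*)$"
)


def parse_bisect_log(text):
    """Return (marks, first_bad).

    marks is a list of (verdict, sha, subject) tuples in log order;
    first_bad is a (sha, subject) tuple, or None if the log has no
    "first bad commit" line.
    """
    marks, first_bad = [], None
    for line in text.splitlines():
        match = COMMENT_RE.match(line)
        if not match:
            continue  # skip the "git bisect ..." command lines
        verdict, sha, subject = match.groups()
        if verdict == "first bad commit":
            first_bad = (sha, subject)
        else:
            marks.append((verdict, sha, subject))
    return marks, first_bad


marks, first_bad = parse_bisect_log(BISECT_LOG)
for verdict, sha, subject in marks:
    print(f"{verdict:4} {sha[:12]} {subject}")
if first_bad:
    print("first bad:", first_bad[0][:12], first_bad[1])
```

The sketch only reads the comment lines and ignores the "git bisect ..." command lines, since the comments already carry the verdict, the hash, and the subject. To re-run the same bisection in a checkout of the branch, git itself offers `git bisect replay <logfile>`.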