On Thu, 20 Jun 2024 at 19:50, David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> On 20.06.24 16:02, Naresh Kamboju wrote:
> > On Thu, 20 Jun 2024 at 19:23, David Hildenbrand <david@xxxxxxxxxx> wrote:
> >>
> >> On 20.06.24 15:14, Naresh Kamboju wrote:
> >>> On Thu, 20 Jun 2024 at 17:59, Greg Kroah-Hartman
> >>> <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> >>>>
> >>>> On Thu, Jun 20, 2024 at 05:21:09PM +0530, Naresh Kamboju wrote:
> >>>>> On Wed, 19 Jun 2024 at 18:41, Greg Kroah-Hartman
> >>>>> <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> >>>>>>
> >>>>>> This is the start of the stable review cycle for the 6.9.6 release.
> >>>>>> There are 281 patches in this series, all will be posted as a response
> >>>>>> to this one. If anyone has any issues with these being applied, please
> >>>>>> let me know.
> >>>>>>
> >>>>>> Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000.
> >>>>>> Anything received after that time might be too late.
> >>>>>>
> >>>>>> The whole patch series can be found in one patch at:
> >>>>>> https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.9.6-rc1.gz
> >>>>>> or in the git tree and branch at:
> >>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.9.y
> >>>>>> and the diffstat can be found below.
> >>>>>>
> >>>>>> thanks,
> >>>>>>
> >>>>>> greg k-h
> >>>>>
> >>>>> There are two major issues on arm64 Juno-r2 on Linux stable-rc 6.9.6-rc1
> >>>>>
> >>>>> Reported-by: Linux Kernel Functional Testing <lkft@xxxxxxxxxx>
> >>>>>
> >>>>> 1)
> >>>>> The LTP controllers cgroup_fj_stress test cases causing kernel crash
> >>>>> on arm64 Juno-r2 with
> >>>>> compat mode testing with stable-rc 6.9 kernel.
> >>>>>
> >>>>> In the recent past I have reported this issues on Linux mainline.
> >>>>>
> >>>>> LTP: fork13: kernel panic on rk3399-rock-pi-4 running mainline 6.10.rc3
> >>>>> - https://lore.kernel.org/all/CA+G9fYvKmr84WzTArmfaypKM9+=Aw0uXCtuUKHQKFCNMGJyOgQ@xxxxxxxxxxxxxx/
> >>>>>
> >>>>> it goes like this,
> >>>>> Unable to handle kernel NULL pointer dereference at virtual address
> >>>>> ...
> >>>>> Insufficient stack space to handle exception!
> >>>>> end Kernel panic - not syncing: kernel stack overflow
> >>>>>
> >>
> >> How is that related to 6.9.6-rc1? That report is from mainline (6.10.rc3).
> >>
> >> Can you share a similar kernel dmesg output from the issue on 6.9.6-rc1?
> >
> > I request you to use this link for detailed boot log, test log and crash log.
> > - https://lkft.validation.linaro.org/scheduler/job/7687060#L23314
> >
> > Few more logs related to build artifacts links provided in the original
> > email thread and bottom of this email.
> >
> > crash log:
> > ---
>

Thanks for investigating this crash report.

> Thanks, so this is something different than the
>
> "BUG: Bad page map in process fork13
> BUG: Bad rss-counter state mm:"
>
> stuff on mainline you referenced.
>
> Looks like some recursive exception until we exhausted the stack.

You are right !
I see only one common case is, exhaust the stack.

>
>
> Trying to connect the dots here, can you enlighten me how this is
> related to the fork13 mainline report?

I am not sure about the relation between these two reports.
But as a common practice I have shared that report information.

> > [ 0.000000] Booting Linux on physical CPU 0x0000000100 [0x410fd033]
> > [ 0.000000] Linux version 6.9.6-rc1 (tuxmake@tuxmake)
> > (aarch64-linux-gnu-gcc (Debian 13.2.0-12) 13.2.0, GNU ld (GNU Binutils
> > for Debian) 2.42) #1 SMP PREEMPT @1718817000
> > ...
> > [ 1786.336761] Unable to handle kernel NULL pointer dereference at
> > virtual address 0000000000000070
> > [ 1786.345564] Mem abort info:
> > [ 1786.348359] ESR = 0x0000000096000004
> > [ 1786.352112] EC = 0x25: DABT (current EL), IL = 32 bits
> > [ 1786.357434] SET = 0, FnV = 0
> > [ 1786.360492] EA = 0, S1PTW = 0
> > [ 1786.363637] FSC = 0x04: level 0 translation fault
> > [ 1786.368523] Data abort info:
> > [ 1786.371405] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
> > [ 1786.376900] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> > [ 1786.381960] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> > [ 1786.387284] Unable to handle kernel NULL pointer dereference at
> > virtual address 0000000000000070
> > [ 1786.387293] Insufficient stack space to handle exception!
> > [ 1786.387296] ESR: 0x0000000096000047 -- DABT (current EL)
> > [ 1786.387302] FAR: 0xffff80008399ffe0
> > [ 1786.387306] Task stack: [0xffff8000839a0000..0xffff8000839a4000]
> > [ 1786.387312] IRQ stack: [0xffff8000837f8000..0xffff8000837fc000]
> > [ 1786.387319] Overflow stack: [0xffff00097ec95320..0xffff00097ec96320]
> > [ 1786.387327] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 6.9.6-rc1 #1
> > [ 1786.387338] Hardware name: ARM Juno development board (r2) (DT)
> > [ 1786.387344] pstate: a00003c5 (NzCv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [ 1786.387355] pc : _prb_read_valid (kernel/printk/printk_ringbuffer.c:2109)
> > [ 1786.387374] lr : prb_read_valid (kernel/printk/printk_ringbuffer.c:2183)
> > [ 1786.387385] sp : ffff80008399ffe0
> > [ 1786.387390] x29: ffff8000839a0030 x28: ffff000800365f00 x27: ffff800082530008
> > [ 1786.387407] x26: ffff8000834e33b8 x25: ffff8000839a00b0 x24: 0000000000000001
> > [ 1786.387423] x23: ffff8000839a00a8 x22: ffff8000830e3e40 x21: 0000000000001e9e
> > [ 1786.387438] x20: 0000000000000000 x19: ffff8000839a01c8 x18: 0000000000000010
> > [ 1786.387453] x17: 72646461206c6175 x16: 7472697620746120 x15: 65636e6572656665
> > [ 1786.387468] x14: 726564207265746e x13: 3037303030303030 x12: 3030303030303030
> > [ 1786.387483] x11: 2073736572646461 x10: ffff800083151ea0 x9 : ffff80008014273c
> > [ 1786.387498] x8 : ffff8000839a0120 x7 : 0000000000000000 x6 : 0000000000000e9f
> > [ 1786.387512] x5 : ffff8000839a00c8 x4 : ffff8000837157c0 x3 : 0000000000000000
> > [ 1786.387526] x2 : ffff8000839a00b0 x1 : 0000000000000000 x0 : ffff8000830e3f58
> > [ 1786.387542] Kernel panic - not syncing: kernel stack overflow
> > [ 1786.387549] SMP: stopping secondary CPUs
> > [ 1787.510055] SMP: failed to stop secondary CPUs 0,4
> > [ 1787.510065] Kernel Offset: disabled
> > [ 1787.510068] CPU features: 0x4,00001061,e0100000,0200421b
> > [ 1787.510076] Memory Limit: none
> > [ 1787.680436] ---[ end Kernel panic - not syncing: kernel stack overflow ]---
> >
> --
> Cheers,
>
> David / dhildenb

- Naresh
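
For anyone who wants to cross-check the numbers in the crash log, here is a minimal Python sketch. It assumes the usual Arm ESR_ELx layout for data aborts (EC in bits [31:26], IL in bit 25, WnR in bit 6, fault status code in bits [5:0]). The helper names are hypothetical and are not taken from the kernel or any tool; the constants are copied from the log.

```python
# Values copied from the 6.9.6-rc1 crash log in this thread.
ESR_FIRST = 0x0000000096000004      # "ESR = ..." in the Mem abort info block
ESR_OVERFLOW = 0x0000000096000047   # "ESR: ..." after "Insufficient stack space"
FAR = 0xFFFF80008399FFE0            # faulting address, also the reported sp
TASK_STACK_LOW = 0xFFFF8000839A0000
TASK_STACK_HIGH = 0xFFFF8000839A4000


def decode_esr(esr: int) -> dict:
    """Split an ESR value into a few fields (assumed data-abort layout)."""
    return {
        "ec": (esr >> 26) & 0x3F,   # exception class, bits [31:26]
        "il": (esr >> 25) & 0x1,    # instruction length bit
        "wnr": (esr >> 6) & 0x1,    # 1 = fault on write, 0 = fault on read
        "fsc": esr & 0x3F,          # fault status code, bits [5:0]
    }


def bytes_below_stack(addr: int, stack_low: int) -> int:
    """How far addr lies below the lowest valid task stack address."""
    return stack_low - addr


def reg_ascii(value: int) -> str:
    """Interpret a 64-bit register value as 8 little-endian ASCII bytes."""
    return value.to_bytes(8, "little").decode("ascii")


if __name__ == "__main__":
    print(decode_esr(ESR_FIRST))
    print(decode_esr(ESR_OVERFLOW))
    print(hex(TASK_STACK_HIGH - TASK_STACK_LOW))        # task stack size
    print(bytes_below_stack(FAR, TASK_STACK_LOW))       # overshoot in bytes
    print(repr(reg_ascii(0x72646461206C6175)))          # x17
    print(repr(reg_ascii(0x3037303030303030)))          # x13
```

With these inputs the first ESR decodes to EC 0x25, WnR 0 and FSC 0x04, which agrees with the "Mem abort info" lines. FAR and sp sit 32 bytes below the lower bound of the 16 KiB task stack, which fits the "kernel stack overflow" panic. x17 and x13 decode to "ual addr" and "00000070", so several registers appear to hold fragments of the oops message text, though that alone does not show where the recursion started.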