Hello! Running a simple stress-ng as a non-privileged (non root) user : stress-ng --opcode 1 --timeout 60 --metrics-brief will almost always bring the linux kernel to an unusable state, starting from "Unable to handle kernel NULL pointer dereference", "Bogus kernel PC [0000000000000000] in fault handler", "Kernel unaligned access at TPC", "Unable to handle kernel paging request at virtual address" and "rcu: INFO: rcu_sched detected stalls on CPUs/tasks"... Can you please suggest a basic linux kernel config "debug" option(s) to start with to try to debug this issue(s) and report kernel logs ? Thanks. PS: kernel logs with the last run: [ 144.386901] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: [ 144.386954] rcu: 0-...!: (3 GPs behind) idle=574/0/0x0 softirq=427/427 fqs=1 (false positive?) [ 144.387029] (detected by 13, t=5252 jiffies, g=3145, q=224) [ 144.387161] CPU[ 0]: TSTATE[0000004480001600] TPC[000000000042c324] TNPC[000000000042c328] TASK[swapper/0:0] [ 144.387202] TPC[arch_cpu_idle+0xa4/0xc0] O7[arch_cpu_idle+0x90/0xc0] I7[default_idle_call+0xd8/0x1a0] RPC[do_idle+0x110/0x1a0] [ 144.387278] rcu: rcu_sched kthread starved for 5248 jiffies! g3145 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=23 [ 144.387317] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior. [ 144.387349] rcu: RCU grace-period kthread stack dump: [ 144.387371] task:rcu_sched state:I stack: 0 pid: 10 ppid: 2 flags:0x07000000 [ 144.387410] Call Trace: [ 144.387425] [<0000000000bf9c4c>] schedule+0x8c/0x100 [ 144.387455] [<0000000000bfef2c>] schedule_timeout+0x16c/0x1a0 [ 144.387481] [<00000000004f2290>] rcu_gp_fqs_loop+0x1d0/0x5e0 [ 144.387508] [<00000000004f5318>] rcu_gp_kthread+0x258/0x2a0 [ 144.387534] [<000000000048c0e0>] kthread+0x100/0x120 [ 144.387558] [<00000000004060c8>] ret_from_fork+0x1c/0x2c [ 144.387591] [<0000000000000000>] 0x0 [ 165.393918] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: [ 165.393975] rcu: 23-...0: (1 GPs behind) idle=16a/1/0x4000000000000000 softirq=1365/1366 fqs=1750 [ 165.394041] (detected by 13, t=5252 jiffies, g=3149, q=182) [ 165.394185] CPU[ 23]: TSTATE[0000000000000000] TPC[0000000000000000] TNPC[0000000000000000] TASK[NULL:-1] [ 165.394223] TPC[0] O7[0] I7[0] RPC[0] [ 205.290596] Unable to handle kernel paging request at virtual address 0000000001900000 [ 205.290657] tsk->{mm,active_mm}->context = 0000000000000b3c [ 205.290683] tsk->{mm,active_mm}->pgd = fff8000045abc000 [ 205.290707] \|/ ____ \|/ [ 205.290707] "@'/ .. \`@" [ 205.290707] /_| \__/ |_\ [ 205.290707] \__U_/ [ 205.290760] kworker/u512:3(394): Oops [#1] [ 205.290786] CPU: 20 PID: 394 Comm: kworker/u512:3 Tainted: G E 5.11.0-rc1-00073-g3516bd729358 #172 [ 205.290831] Workqueue: events_unbound flush_to_ldisc [ 205.290868] TSTATE: 0000000080e01605 TPC: 000000000049c0d4 TNPC: 000000000049c0d8 Y: 00000000 Tainted: G E [ 205.290911] TPC: <task_curr+0x34/0x60> [ 205.290943] g0: 0000000000000000 g1: 0000000001900100 g2: 00000000010fbd00 g3: 0000000001100100 [ 205.290977] g4: fff800003c27d680 g5: fff80021a9c20000 g6: fff800003df98000 g7: fff8000035e58880 [ 205.291013] o0: fff800003041ca80 o1: 00000000004743d8 o2: 0000000000000a20 o3: 0000000000000001 [ 205.291048] o4: fff800003df9b0d8 o5: dfccfb3737fbd15f sp: fff800003df9a841 ret_pc: 000000000068db94 [ 205.291083] RPC: <kmem_cache_alloc+0x3f4/0x4c0> [ 205.291114] l0: 00000000010e11c0 l1: 000000000000000e l2: 0000000000474300 l3: 0000000000de1250 [ 205.291149] l4: 0000000000e20ad0 l5: 0000000000edcc00 l6: fff8000037482a98 l7: 000000000114a888 [ 205.291184] i0: 0000000000000000 i1: 0000000000000a20 i2: fff8000040443078 i3: 0000000000edc800 [ 205.291219] i4: 0000000000edbc78 i5: fff800003041ca80 i6: fff800003df9a8f1 i7: 0000000000474c3c [ 205.291253] I7: <complete_signal+0x5c/0x2e0> [ 205.291279] Call Trace: [ 205.291295] [<0000000000474c3c>] complete_signal+0x5c/0x2e0 [ 205.291322] [<00000000004754c4>] __send_signal+0x2c4/0x4a0 [ 205.291348] [<0000000000476cd0>] send_signal+0x170/0x180 [ 205.291375] [<0000000000478a20>] group_send_sig_info+0xa0/0xe0 [ 205.291404] [<0000000000478a98>] __kill_pgrp_info+0x38/0x80 [ 205.291430] [<0000000000478b00>] kill_pgrp+0x20/0x40 [ 205.291455] [<00000000009a1490>] isig+0x70/0x140 [ 205.291485] [<00000000009a1770>] n_tty_receive_signal_char+0x10/0x80 [ 205.291514] [<00000000009a2cac>] n_tty_receive_char_special+0xcc/0x6a0 [ 205.291544] [<00000000009a3408>] n_tty_receive_buf_fast+0x188/0x220 [ 205.291573] [<00000000009a396c>] __receive_buf+0x1cc/0x300 [ 205.291600] [<00000000009a3bfc>] n_tty_receive_buf_common+0x15c/0x2e0 [ 205.291629] [<00000000009a3d9c>] n_tty_receive_buf2+0x1c/0x40 [ 205.291656] [<00000000009a6864>] tty_ldisc_receive_buf+0x24/0x80 [ 205.291686] [<00000000009a74b8>] tty_port_default_receive_buf+0x38/0x60 [ 205.291716] [<00000000009a6e30>] flush_to_ldisc+0xb0/0x120 [ 205.291741] Disabling lock debugging due to kernel taint [ 205.291750] Caller[0000000000474c3c]: complete_signal+0x5c/0x2e0 [ 205.291760] Caller[00000000004754c4]: __send_signal+0x2c4/0x4a0 [ 205.291770] Caller[0000000000476cd0]: send_signal+0x170/0x180 [ 205.291780] Caller[0000000000478a20]: group_send_sig_info+0xa0/0xe0 [ 205.291790] Caller[0000000000478a98]: __kill_pgrp_info+0x38/0x80 [ 205.291800] Caller[0000000000478b00]: kill_pgrp+0x20/0x40 [ 205.291809] Caller[00000000009a1490]: isig+0x70/0x140 [ 205.291819] Caller[00000000009a1770]: n_tty_receive_signal_char+0x10/0x80 [ 205.291829] Caller[00000000009a2cac]: n_tty_receive_char_special+0xcc/0x6a0 [ 205.291840] Caller[00000000009a3408]: n_tty_receive_buf_fast+0x188/0x220 [ 205.291850] Caller[00000000009a396c]: __receive_buf+0x1cc/0x300 [ 205.291860] Caller[00000000009a3bfc]: n_tty_receive_buf_common+0x15c/0x2e0 [ 205.291871] Caller[00000000009a3d9c]: n_tty_receive_buf2+0x1c/0x40 [ 205.291881] Caller[00000000009a6864]: tty_ldisc_receive_buf+0x24/0x80 [ 205.291891] Caller[00000000009a74b8]: tty_port_default_receive_buf+0x38/0x60 [ 205.291901] Caller[00000000009a6e30]: flush_to_ldisc+0xb0/0x120 [ 205.291910] Caller[0000000000484eb0]: process_one_work+0x390/0x620 [ 205.291925] Caller[00000000004853f0]: worker_thread+0x2b0/0x4a0 [ 205.291935] Caller[000000000048c0e0]: kthread+0x100/0x120 [ 205.291946] Caller[00000000004060c8]: ret_from_fork+0x1c/0x2c [ 205.291962] Caller[0000000000000000]: 0x0 [ 205.291971] Instruction DUMP: [ 205.291974] 83306010 [ 205.291980] 83287008 [ 205.291987] 8200c001 [ 205.291992] <c25860f8> [ 205.291999] 84004002 [ 205.292005] c258aa28 [ 205.292011] 8e19c001 [ 205.292017] 81cfe008 [ 205.292023] 9179e401 [ 205.292029] [ 228.411239] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: [ 228.411273] rcu: 23-...0: (1 GPs behind) idle=16a/1/0x4000000000000000 softirq=1365/1366 fqs=7002 [ 228.411312] (detected by 9, t=21007 jiffies, g=3149, q=224) [ 228.411439] CPU[ 23]: TSTATE[0000000000000000] TPC[0000000000000000] TNPC[0000000000000000] TASK[NULL:-1] [ 228.411451] TPC[0] O7[0] I7[0] RPC[0] [ 291.429047] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: [ 291.429084] rcu: 23-...0: (1 GPs behind) idle=16a/1/0x4000000000000000 softirq=1365/1366 fqs=12254 [ 291.429129] (detected by 9, t=36762 jiffies, g=3149, q=265) [ 291.429256] CPU[ 23]: TSTATE[0000000000000000] TPC[0000000000000000] TNPC[0000000000000000] TASK[NULL:-1] [ 291.429269] TPC[0] O7[0] I7[0] RPC[0] [ 312.540280] watchdog: BUG: soft lockup - CPU#16 stuck for 23s! [systemd-logind:652] [ 312.540318] Modules linked in: tun(E) binfmt_misc(E) camellia_sparc64(E) des_sparc64(E) aes_sparc64(E) n2_rng(E) md5_sparc64(E) rng_core(E) flash(E) sha512_sparc64(E) sha256_sparc64(E) sha1_sparc64(E) nft_ct(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nft_counter(E) nf_tables(E) nfnetlink(E) sunrpc(E) fuse(E) configfs(E) ip_tables(E) x_tables(E) autofs4(E) raid10(E) raid456(E) async_raid6_recov(E) async_memcpy(E) async_pq(E) raid6_pq(E) async_xor(E) xor(E) async_tx(E) raid1(E) raid0(E) multipath(E) linear(E) md_mod(E) dm_mod(E) crc32c_sparc64(E) sunvnet(E) sunvnet_common(E) [ 312.540485] CPU: 16 PID: 652 Comm: systemd-logind Tainted: G D W E 5.11.0-rc1-00073-g3516bd729358 #172 [ 312.540501] TSTATE: 0000000011001601 TPC: 00000000005183ec TNPC: 00000000005183f0 Y: 00000006 Tainted: G D W E [ 312.540514] TPC: <smp_call_function_many_cond+0x38c/0x3e0> [ 312.540529] g0: 000000000069952c g1: 0000000000000011 g2: fff80021aade3420 g3: 0000000000000017 [ 312.540540] g4: fff8000044eb6180 g5: fff80021a9b20000 g6: fff8000044f7c000 g7: 0000000000c00000 [ 312.540551] o0: 0000000000000017 o1: fff80021aac1ccc8 o2: fff80021aaddcc80 o3: fff80021aac1cce8 [ 312.540562] o4: 0000000000000000 o5: fff80021aac1cce8 sp: fff8000044f7e931 ret_pc: 0000000000518408 [ 312.540572] RPC: <smp_call_function_many_cond+0x3a8/0x3e0> [ 312.540582] l0: 0000000000000100 l1: fff80021aac1ccc8 l2: 0000000000edbe64 l3: 0000000001100100 [ 312.540592] l4: 0000000001103420 l5: fff80021aac1cce8 l6: 0000000000000017 l7: 0000000000000100 [ 312.540602] i0: fff80021aac1ccc0 i1: 000000000043f000 i2: fff8000044f7f358 i3: 0000000000000001 [ 312.540612] i4: 0000000000000000 i5: 0000000001100100 i6: fff8000044f7e9f1 i7: 000000000051845c [ 312.540622] I7: <smp_call_function_many+0x1c/0x40> [ 312.540632] Call Trace: [ 312.540639] [<000000000051845c>] smp_call_function_many+0x1c/0x40 [ 312.540649] [<00000000004402d4>] smp_flush_tlb_pending+0x34/0x60 [ 312.540662] [<000000000044ed14>] flush_tlb_pending+0x74/0xa0 [ 312.540676] [<000000000044eea4>] arch_leave_lazy_mmu_mode+0x24/0x40 [ 312.540687] [<000000000064a81c>] zap_pte_range+0x59c/0x6c0 [ 312.540698] [<000000000064aac8>] unmap_page_range+0x188/0x240 [ 312.540708] [<000000000064ac40>] unmap_single_vma+0xc0/0xe0 [ 312.540717] [<000000000064ad7c>] unmap_vmas+0x1c/0x60 [ 312.540726] [<00000000006543cc>] exit_mmap+0x14c/0x1e0 [ 312.540742] [<000000000045fcf8>] mmput+0x58/0x120 [ 312.540754] [<0000000000469cf8>] exit_mm+0x1b8/0x200 [ 312.540766] [<000000000046a1f4>] do_exit+0x4b4/0x500 [ 312.540776] [<000000000046a364>] do_group_exit+0xa4/0xc0 [ 312.540786] [<00000000004786c0>] get_signal+0x600/0x680 [ 312.540800] [<000000000042d52c>] do_signal+0x4c/0x1c0 [ 312.540817] [<000000000042de70>] do_notify_resume+0x30/0x80 [ 354.446709] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: [ 354.446747] rcu: 23-...0: (1 GPs behind) idle=16a/1/0x4000000000000000 softirq=1365/1366 fqs=17506 [ 354.446792] (detected by 13, t=52517 jiffies, g=3149, q=288) [ 354.446920] CPU[ 23]: TSTATE[0000000000000000] TPC[0000000000000000] TNPC[0000000000000000] TASK[NULL:-1] [ 354.446933] TPC[0] O7[0] I7[0] RPC[0] [ 417.464277] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: [ 417.464314] rcu: 23-...0: (1 GPs behind) idle=16a/1/0x4000000000000000 softirq=1365/1366 fqs=22758 [ 417.464359] (detected by 13, t=68272 jiffies, g=3149, q=288) [ 417.464487] CPU[ 23]: TSTATE[0000000000000000] TPC[0000000000000000] TNPC[0000000000000000] TASK[NULL:-1] [ 417.464500] TPC[0] O7[0] I7[0] RPC[0] [ 480.481809] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: [ 480.481847] rcu: 23-...0: (1 GPs behind) idle=16a/1/0x4000000000000000 softirq=1365/1366 fqs=28010 [ 480.481890] (detected by 13, t=84027 jiffies, g=3149, q=288) [ 480.482018] CPU[ 23]: TSTATE[0000000000000000] TPC[0000000000000000] TNPC[0000000000000000] TASK[NULL:-1] [ 480.482031] TPC[0] O7[0] I7[0] RPC[0] [ 543.499328] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: [ 543.499366] rcu: 23-...0: (1 GPs behind) idle=16a/1/0x4000000000000000 softirq=1365/1366 fqs=33262 [ 543.499411] (detected by 13, t=99782 jiffies, g=3149, q=288) [ 543.499538] CPU[ 23]: TSTATE[0000000000000000] TPC[0000000000000000] TNPC[0000000000000000] TASK[NULL:-1] [ 543.499550] TPC[0] O7[0] I7[0] RPC[0] [ 606.516843] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: [ 606.516881] rcu: 23-...0: (1 GPs behind) idle=16a/1/0x4000000000000000 softirq=1365/1366 fqs=38514 [ 606.516924] (detected by 13, t=115537 jiffies, g=3149, q=288) [ 606.517052] CPU[ 23]: TSTATE[0000000000000000] TPC[0000000000000000] TNPC[0000000000000000] TASK[NULL:-1] [ 606.517065] TPC[0] O7[0] I7[0] RPC[0] [ 624.568155] BUG: workqueue lockup - pool cpus=2 node=0 flags=0x0 nice=0 stuck for 306s! [ 624.568322] Showing busy workqueues and worker pools: [ 624.568374] workqueue events: flags=0x0 [ 624.568639] pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=1/256 refcnt=2 [ 624.568663] pending: psi_avgs_work [ 624.568685] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2 [ 624.568707] pending: vmstat_shepherd [ 624.568813] workqueue events_unbound: flags=0x2 [ 624.568822] pwq 512: cpus=0-255 flags=0x4 nice=0 active=2/1024 refcnt=4 [ 624.568850] in-flight: 394:flush_to_ldisc flush_to_ldisc [ 624.569131] workqueue mm_percpu_wq: flags=0x8 [ 624.569356] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2 [ 624.569377] pending: vmstat_update [ 624.573421] pool 512: cpus=0-255 flags=0x4 nice=0 hung=302s workers=5 idle: 395 166 7 281 [ 669.534354] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: [ 669.534393] rcu: 23-...0: (1 GPs behind) idle=16a/1/0x4000000000000000 softirq=1365/1366 fqs=43766 [ 669.534439] (detected by 16, t=131292 jiffies, g=3149, q=292) [ 669.534568] CPU[ 23]: TSTATE[0000000000000000] TPC[0000000000000000] TNPC[0000000000000000] TASK[NULL:-1] [ 669.534581] TPC[0] O7[0] I7[0] RPC[0] [ 696.525288] watchdog: BUG: soft lockup - CPU#16 stuck for 22s! [systemd-logind:652] [ 696.525330] Modules linked in: tun(E) binfmt_misc(E) camellia_sparc64(E) des_sparc64(E) aes_sparc64(E) n2_rng(E) md5_sparc64(E) rng_core(E) flash(E) sha512_sparc64(E) sha256_sparc64(E) sha1_sparc64(E) nft_ct(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nft_counter(E) nf_tables(E) nfnetlink(E) sunrpc(E) fuse(E) configfs(E) ip_tables(E) x_tables(E) autofs4(E) raid10(E) raid456(E) async_raid6_recov(E) async_memcpy(E) async_pq(E) raid6_pq(E) async_xor(E) xor(E) async_tx(E) raid1(E) raid0(E) multipath(E) linear(E) md_mod(E) dm_mod(E) crc32c_sparc64(E) sunvnet(E) sunvnet_common(E) [ 696.525498] CPU: 16 PID: 652 Comm: systemd-logind Tainted: G D W EL 5.11.0-rc1-00073-g3516bd729358 #172 [ 696.525514] TSTATE: 0000000011001601 TPC: 00000000005183dc TNPC: 00000000005183e0 Y: 00000006 Tainted: G D W EL [ 696.525526] TPC: <smp_call_function_many_cond+0x37c/0x3e0> [ 696.525542] g0: 000000000069952c g1: 0000000000000011 g2: fff80021aade3420 g3: 0000000000000017 [ 696.525553] g4: fff8000044eb6180 g5: fff80021a9b20000 g6: fff8000044f7c000 g7: 0000000000c00000 [ 696.525563] o0: 0000000000000017 o1: fff80021aac1ccc8 o2: fff80021aaddcc80 o3: fff80021aac1cce8 [ 696.525574] o4: 0000000000000000 o5: fff80021aac1cce8 sp: fff8000044f7e931 ret_pc: 0000000000518408 [ 696.525584] RPC: <smp_call_function_many_cond+0x3a8/0x3e0> [ 696.525594] l0: 0000000000000100 l1: fff80021aac1ccc8 l2: 0000000000edbe64 l3: 0000000001100100 [ 696.525604] l4: 0000000001103420 l5: fff80021aac1cce8 l6: 0000000000000017 l7: 0000000000000100 [ 696.525615] i0: fff80021aac1ccc0 i1: 000000000043f000 i2: fff8000044f7f358 i3: 0000000000000001 [ 696.525625] i4: 0000000000000000 i5: 0000000001100100 i6: fff8000044f7e9f1 i7: 000000000051845c [ 696.525635] I7: <smp_call_function_many+0x1c/0x40> [ 696.525645] Call Trace: [ 696.525652] [<000000000051845c>] smp_call_function_many+0x1c/0x40 [ 696.525663] [<00000000004402d4>] smp_flush_tlb_pending+0x34/0x60 [ 696.525675] [<000000000044ed14>] flush_tlb_pending+0x74/0xa0 [ 696.525689] [<000000000044eea4>] arch_leave_lazy_mmu_mode+0x24/0x40 [ 696.525700] [<000000000064a81c>] zap_pte_range+0x59c/0x6c0 [ 696.525712] [<000000000064aac8>] unmap_page_range+0x188/0x240 [ 696.525722] [<000000000064ac40>] unmap_single_vma+0xc0/0xe0 [ 696.525731] [<000000000064ad7c>] unmap_vmas+0x1c/0x60 [ 696.525741] [<00000000006543cc>] exit_mmap+0x14c/0x1e0 [ 696.525756] [<000000000045fcf8>] mmput+0x58/0x120 [ 696.525769] [<0000000000469cf8>] exit_mm+0x1b8/0x200 [ 696.525781] [<000000000046a1f4>] do_exit+0x4b4/0x500 [ 696.525791] [<000000000046a364>] do_group_exit+0xa4/0xc0 [ 696.525801] [<00000000004786c0>] get_signal+0x600/0x680 [ 696.525815] [<000000000042d52c>] do_signal+0x4c/0x1c0 [ 696.525832] [<000000000042de70>] do_notify_resume+0x30/0x80