Running the strace test suite on my SPARC (UltraSPARC IIIi) results in a kernel oops and hang:

Unable to handle kernel NULL pointer dereference
tsk->{mm,active_mm}->context = 00000000000012d5
tsk->{mm,active_mm}->pgd = fff000023c8a4000
              \|/ ____ \|/
              "@'/ .. \`@"
              /_| \__/ |_\
                 \__U_/
trace_creds.gen(31718): Oops [#1]
CPU: 1 PID: 31718 Comm: trace_creds.gen Not tainted 5.9.0-rc1 #1
TSTATE: 0000009980001606 TPC: 00000000004bf044 TNPC: 00000000004bf048 Y: 00000000    Not tainted
TPC: <bpf_prog_free+0x14/0x48>
g0: 0000000000000001 g1: 0000000fffffffe0 g2: fff00012382ad2e8 g3: 00000000000001d8
g4: fff000023c239d00 g5: fff000123eee8000 g6: fff000023dc24000 g7: 000c000024769f88
o0: 0000000000000002 o1: fff000123fe71878 o2: 0000000000000000 o3: fff000123f80c0b0
o4: 000c00002476f788 o5: 0000000000000000 sp: fff000123fe674f1 ret_pc: 000000000070ee60
RPC: <sk_filter_release_rcu+0x8/0x18>
l0: 00000000000093a4 l1: 00000000008df400 l2: 000000000000000b l3: 0000000000123400
l4: 000000000016d1c0 l5: 0000000000162e00 l6: 0000000000000000 l7: 00000000f77a2000
i0: fff000123a00d960 i1: 0000000000000000 i2: fff000123afe0d00 i3: 0000000000024265
i4: fff000123afe0d00 i5: 0000000000000000 i6: fff000123fe675a1 i7: 0000000000499c98
I7: <rcu_core+0x2dc/0x498>
Call Trace:
 [<0000000000499c98>] rcu_core+0x2dc/0x498
 [<0000000000786144>] __do_softirq+0x1b4/0x200
 [<000000000042b988>] do_softirq_own_stack+0x2c/0x40
 [<0000000000459d60>] __irq_exit_rcu+0x58/0xa8
 [<0000000000459f7c>] irq_exit+0x4/0x14
 [<0000000000785e6c>] timer_interrupt+0x88/0xb0
 [<00000000004209d4>] tl0_irq14+0x14/0x20
 [<00000000007859b0>] _raw_spin_unlock_irqrestore+0x8/0x10
 [<00000000004c4d80>] filemap_map_pages+0x194/0x1cc
 [<00000000004e4598>] handle_mm_fault+0x520/0x7ac
 [<0000000000448854>] do_sparc64_fault+0x408/0x624
 [<0000000000407714>] sparc64_realfault_common+0x10/0x20
Disabling lock debugging due to kernel taint
Caller[0000000000499c98]: rcu_core+0x2dc/0x498
Caller[0000000000786144]: __do_softirq+0x1b4/0x200
Caller[000000000042b988]: do_softirq_own_stack+0x2c/0x40
Caller[0000000000459d60]: __irq_exit_rcu+0x58/0xa8
Caller[0000000000459f7c]: irq_exit+0x4/0x14
Caller[0000000000785e6c]: timer_interrupt+0x88/0xb0
Caller[00000000004209d4]: tl0_irq14+0x14/0x20
Caller[00000000004e2be8]: alloc_set_pte+0x12c/0x1c0
Caller[00000000004c4d80]: filemap_map_pages+0x194/0x1cc
Caller[00000000004e4598]: handle_mm_fault+0x520/0x7ac
Caller[0000000000448854]: do_sparc64_fault+0x408/0x624
Caller[0000000000407714]: sparc64_realfault_common+0x10/0x20
Caller[00000000f7638d64]: 0xf7638d64
Instruction DUMP:
 90102002
 83287024
 82187fe0
<c272a210>
 8202a218
 c272a218
 9402a210
 c272a010
 03001307

Kernel panic - not syncing: Aiee, killing interrupt handler!

There are some variations on the oops. bpf_prog_free is usually involved, but the call stack leading to it varies. Sometimes it identifies a user-mode process, sometimes it does not. Once the oops occurred in kfree instead.

Initial checks showed that the problems started with the 5.2 kernel. The visible oops started with

[8e41f8726dcf423621e2b6938d015b9796f6f676] mm/vmalloc: Fix calculation of direct map addr range.

Before that, general kernel instability (hangs and occasional RCU errors, but no oopses) appears to have started with

[d53d2f78ceadba081fc7785570798c3c8d50a718] bpf: Use vmalloc special flag.

Recent versions of strace trigger the oopses, strace-5.3 and 5.8 were tested. Both 32 and 64-bit builds of strace trigger the oopses. The exact point in the test suite where the oops occurs seems to vary, but running the entire test suite reliably triggers it.

Kernels compiled by both gcc-8.4 and 9.3 oops.

I haven't seen this reported anywhere, so perhaps it's only triggered on older platforms like the Ultras.

My kernel's .config is attached.

/Mikael
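
Since the oops comes in several variants (different call stacks leading to bpf_prog_free, once kfree instead), it may help to reduce each captured log to its faulting symbol and call trace before comparing them. The following is only a rough sketch for doing that, not part of any kernel tooling: the function name parse_oops and both regular expressions are assumptions based on the sparc64 oops format shown above, and other architectures or kernel versions may print these lines differently.

```python
import re

# Symbolised faulting PC, e.g. "TPC: <bpf_prog_free+0x14/0x48>".
# The raw "TPC: 00000000004bf044" line is skipped because it has no "<".
TPC_RE = re.compile(r"TPC: <(\w[\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+>")

# Call-trace frames, e.g. "[<0000000000499c98>] rcu_core+0x2dc/0x498".
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(\w[\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def parse_oops(text):
    """Return (faulting_symbol_or_None, [call trace symbols, innermost first])."""
    tpc = TPC_RE.search(text)
    frames = FRAME_RE.findall(text)
    return (tpc.group(1) if tpc else None, frames)
```

Feeding each saved console log through parse_oops and comparing the returned tuples should show quickly which variants share the same faulting function and where their call stacks diverge.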