Re: INFO: rcu detected stall in sys_keyctl

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Mar 4, 2020 at 9:36 PM Kris Karas <linux-1993@xxxxxxxxxxxxxxxx> wrote:
>
> Dmitry Vyukov wrote:
> > Kris Karas wrote:
> >> [...]
> >>       rcu: INFO: rcu_sched self-detected stall on CPU
> >>       rcu:    14-....: (20999 ticks this GP)
> >> idle=216/1/0x4000000000000002 softirq=454/454 fqs=5250
> >>               (t=21004 jiffies g=-755 q=1327)
> >>       NMI backtrace for cpu 14
> >>       CPU: 14 PID: 520 Comm: pidof Tainted: G      D           5.5.7 #1
> >> [...]
> >> I don't have a reproducer for it, either.  It showed up in 5.5.7 (but
> >> might be from earlier as it reproduces so infrequently).
> > Hi Kris,
> >
> > What follows after this stack? That's the most interesting part. The
> > part that you showed is common for all stalls and does not mean much,
> > besides the fact that there is a stall. These can well be very
> > different stalls in different parts of kernel.
>
> Hi Dmitry,
>
> Sorry, dummy me, I should have found my original post in Lore and posted
> a link to that.
> Oh, here we go:
>
> https://lore.kernel.org/lkml/6d4f9ac8-a478-2ae4-0fe3-5d074d267148@xxxxxxxxxxxxxxxx/

These all mention ptrace_may_access, so that looks like a different bug to me.

> Given that the stall stack is not terribly useful, it would seem that
> the OOPS I saw was probably unrelated to this one caught by syzbot,
> though the stalled CPU does make me curious (as in all the OOPSen I've
> encountered in the past 25 years have rarely mentioned an RCU stall).

Well, it means you don't stress the kernel hard enough :)
Search for "rcu detected stall" here:
https://syzkaller.appspot.com/upstream
https://syzkaller.appspot.com/upstream/fixed
https://syzkaller.appspot.com/linux-4.19
https://syzkaller.appspot.com/linux-4.14
and these are all different _bugs_, some of them have tens of
thousands of crash instances. And that's just in 2.5 years :)


> For convenience, I'll re-post everything I was able to salvage from
> dmesg originally.
>
> Kris
>
> The OOPS in the dump, below, occurred while the machine was booting,
> right about the time that /sbin/init switched from runstate S => 3.
> System daemons (haveged, named, syslogd, etc...) were starting. The OOPS
> occurred in /bin/pidof, which is no doubt checking whether a daemon is
> up before attempting to start it.  Under the OOPS, the filesystem was
> functioning (at least well enough to save dmesg to a file), though many
> things were hanging.  It required an Alt-SysRq-E to get me a login
> prompt, and I lucked out in PAM and friends working well enough to give
> me a functioning command prompt. Alt-SysRq-S,U,S,B was necessary to
> reboot.  Without further ado...
>
> BUG: kernel NULL pointer dereference, address: 00000000000000e8
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 7f6a50067 P4D 7f6a50067 PUD 7f6a51067 PMD 0
> Oops: 0000 [#1] SMP
> CPU: 3 PID: 516 Comm: pidof Not tainted 5.5.7 #1
> Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X470 Taichi, BIOS P3.50 07/18/2019
> RIP: 0010:cap_capable+0x13/0x70
> Code: bf f4 ff ff ff 66 90 e9 01 ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 4c 8b 87 88 00 00 00 4c 39 c6 74 39 45 8b 88 e8 00 00 00 <44> 39 8e e8 00 00 00 7e 18 48 8b 86 e0 00 00 00 4c 39 c0 74 12 48
> RSP: 0018:ffffc90000777cb0 EFLAGS: 00010207
> RAX: ffff8887f96ea000 RBX: 0000000000000002 RCX: 0000000000000002
> RDX: 0000000000000013 RSI: 0000000000000000 RDI: ffff8887f9646480
> RBP: 0000000000000013 R08: ffffffff82423da0 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: ffff8887f9646480 R14: ffffffff822a7620 R15: ffff8887fae600c0
> FS:  00007f8ee26cd740(0000) GS:ffff8887fecc0000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000000000e8 CR3: 00000007f8691000 CR4: 00000000003406e0
> Call Trace:
>   security_capable+0x36/0x50
>   ptrace_has_cap+0x14/0x30
>   __ptrace_may_access+0x76/0x110
>   ptrace_may_access+0x28/0x50
>   do_task_stat+0x7b/0xd90
>   ? do_filp_open+0xab/0x100
>   proc_single_show+0x54/0xc0
>   ? __kmalloc+0x183/0x210
>   seq_read+0xbb/0x3c0
>   vfs_read+0xc6/0x150
>   ksys_read+0x6b/0x100
>   do_syscall_64+0x3d/0x120
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f8ee27d682e
> Code: c0 e9 f6 fe ff ff 50 48 8d 3d b6 5d 0a 00 e8 e9 fd 01 00 66 0f 1f 84 00 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28
> RSP: 002b:00007ffdc7fdcf58 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
> RAX: ffffffffffffffda RBX: 00007f8ee28ce958 RCX: 00007f8ee27d682e
> RDX: 0000000000000400 RSI: 00000000017e2590 RDI: 0000000000000004
> RBP: 00007f8ee28ce950 R08: 00007f8ee28ac120 R09: 00007ffdc7fdce00
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
> R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffdc7fddf02
> Modules linked in:
> CR2: 00000000000000e8
> ---[ end trace 9da0e81512fbb929 ]---
> RIP: 0010:cap_capable+0x13/0x70
> Code: bf f4 ff ff ff 66 90 e9 01 ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 4c 8b 87 88 00 00 00 4c 39 c6 74 39 45 8b 88 e8 00 00 00 <44> 39 8e e8 00 00 00 7e 18 48 8b 86 e0 00 00 00 4c 39 c0 74 12 48
> RSP: 0018:ffffc90000777cb0 EFLAGS: 00010207
> RAX: ffff8887f96ea000 RBX: 0000000000000002 RCX: 0000000000000002
> RDX: 0000000000000013 RSI: 0000000000000000 RDI: ffff8887f9646480
> RBP: 0000000000000013 R08: ffffffff82423da0 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: ffff8887f9646480 R14: ffffffff822a7620 R15: ffff8887fae600c0
> FS:  00007f8ee26cd740(0000) GS:ffff8887fecc0000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000000000e8 CR3: 00000007f8691000 CR4: 00000000003406e0
> udevd[518]: starting eudev-3.2.9
> rcu: INFO: rcu_sched self-detected stall on CPU
> rcu:    14-....: (20999 ticks this GP) idle=216/1/0x4000000000000002 softirq=454/454 fqs=5250
>          (t=21004 jiffies g=-755 q=1327)
> NMI backtrace for cpu 14
> CPU: 14 PID: 520 Comm: pidof Tainted: G      D           5.5.7 #1
> Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X470 Taichi, BIOS P3.50 07/18/2019
> Call Trace:
>   <IRQ>
>   dump_stack+0x50/0x70
>   nmi_cpu_backtrace.cold+0x14/0x53
>   ? lapic_can_unplug_cpu.cold+0x44/0x44
>   nmi_trigger_cpumask_backtrace+0x7b/0x88
>   rcu_dump_cpu_stacks+0x7b/0xa9
>   rcu_sched_clock_irq.cold+0x152/0x39b
>   update_process_times+0x1f/0x50
>   tick_sched_timer+0x40/0x90
>   ? tick_sched_do_timer+0x50/0x50
>   __hrtimer_run_queues+0xdd/0x180
>   hrtimer_interrupt+0x108/0x230
>   smp_apic_timer_interrupt+0x53/0xa0
>   apic_timer_interrupt+0xf/0x20
>   </IRQ>
> RIP: 0010:queued_spin_lock_slowpath+0x41/0x1a0
> Code: 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 75 18 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 8b 07 84 c0 <75> f8 66 c7 07 01 00 c3 f6 c4 01 75 04 c6 47 01 00 48 c7 c0 40 29
> RSP: 0018:ffffc90001e87d08 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
> RAX: 0000000000000101 RBX: ffff8887f96f0000 RCX: ffff8887f96f0000
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8887f96f0658
> RBP: 000000000000000d R08: 0000000000000001 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff8887f50b7080
> R13: ffffffff82424480 R14: ffffffff82424480 R15: ffff8887f50b70c0
>   ptrace_may_access+0x1e/0x50
>   do_task_stat+0x7b/0xd90
>   ? do_filp_open+0xab/0x100
>   proc_single_show+0x54/0xc0
>   ? __kmalloc+0x183/0x210
>   seq_read+0xbb/0x3c0
>   vfs_read+0xc6/0x150
>   ksys_read+0x6b/0x100
>   do_syscall_64+0x3d/0x120
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f666796082e
> Code: c0 e9 f6 fe ff ff 50 48 8d 3d b6 5d 0a 00 e8 e9 fd 01 00 66 0f 1f 84 00 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28
> RSP: 002b:00007ffe26688b38 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
> RAX: ffffffffffffffda RBX: 00007f6667a58958 RCX: 00007f666796082e
> RDX: 0000000000000400 RSI: 000000000153e590 RDI: 0000000000000004
> RBP: 00007f6667a58950 R08: 00007f6667a36120 R09: 00007ffe266889e0
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
> R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffe2668aee9
> ...
> sysrq: Terminate All Tasks
> rcu: INFO: rcu_sched self-detected stall on CPU
> rcu:    14-....: (83876 ticks this GP) idle=216/1/0x4000000000000002 softirq=454/454 fqs=20970
>          (t=84003 jiffies g=-755 q=2695)
> NMI backtrace for cpu 14
> CPU: 14 PID: 520 Comm: pidof Tainted: G      D           5.5.7 #1
> Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X470 Taichi, BIOS P3.50 07/18/2019
> Call Trace:
>   <IRQ>
>   dump_stack+0x50/0x70
>   nmi_cpu_backtrace.cold+0x14/0x53
>   ? lapic_can_unplug_cpu.cold+0x44/0x44
>   nmi_trigger_cpumask_backtrace+0x7b/0x88
>   rcu_dump_cpu_stacks+0x7b/0xa9
>   rcu_sched_clock_irq.cold+0x152/0x39b
>   update_process_times+0x1f/0x50
>   tick_sched_timer+0x40/0x90
>   ? tick_sched_do_timer+0x50/0x50
>   __hrtimer_run_queues+0xdd/0x180
>   hrtimer_interrupt+0x108/0x230
>   smp_apic_timer_interrupt+0x53/0xa0
>   apic_timer_interrupt+0xf/0x20
>   </IRQ>
> RIP: 0010:queued_spin_lock_slowpath+0x3d/0x1a0
> Code: 3e f0 0f ba 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 75 18 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 <8b> 07 84 c0 75 f8 66 c7 07 01 00 c3 f6 c4 01 75 04 c6 47 01 00 48
> RSP: 0018:ffffc90001e87d08 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
> RAX: 0000000000000101 RBX: ffff8887f96f0000 RCX: ffff8887f96f0000
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8887f96f0658
> RBP: 000000000000000d R08: 0000000000000001 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff8887f50b7080
> R13: ffffffff82424480 R14: ffffffff82424480 R15: ffff8887f50b70c0
>   ptrace_may_access+0x1e/0x50
>   do_task_stat+0x7b/0xd90
>   ? do_filp_open+0xab/0x100
>   proc_single_show+0x54/0xc0
>   ? __kmalloc+0x183/0x210
>   seq_read+0xbb/0x3c0
>   vfs_read+0xc6/0x150
>   ksys_read+0x6b/0x100
>   do_syscall_64+0x3d/0x120
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f666796082e
> Code: c0 e9 f6 fe ff ff 50 48 8d 3d b6 5d 0a 00 e8 e9 fd 01 00 66 0f 1f 84 00 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28
> RSP: 002b:00007ffe26688b38 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
> RAX: ffffffffffffffda RBX: 00007f6667a58958 RCX: 00007f666796082e
> RDX: 0000000000000400 RSI: 000000000153e590 RDI: 0000000000000004
> RBP: 00007f6667a58950 R08: 00007f6667a36120 R09: 00007ffe266889e0
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
> R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffe2668aee9
> rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 14-... } 21041 jiffies s: 45 root: 0x4000/.
> rcu: blocking rcu_node structures:
> Task dump for CPU 14:
> pidof           R  running task        0   520      1 0x8000000c
> Call Trace:
>   ? do_syscall_64+0x3d/0x120
>   ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
> rcu: INFO: rcu_sched self-detected stall on CPU
> rcu:    14-....: (146878 ticks this GP) idle=216/1/0x4000000000000002 softirq=454/454 fqs=36715
>          (t=147006 jiffies g=-755 q=3376)
> NMI backtrace for cpu 14
> CPU: 14 PID: 520 Comm: pidof Tainted: G      D           5.5.7 #1
> Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X470 Taichi, BIOS P3.50 07/18/2019
> Call Trace:
>   <IRQ>
>   dump_stack+0x50/0x70
>   nmi_cpu_backtrace.cold+0x14/0x53
>   ? lapic_can_unplug_cpu.cold+0x44/0x44
>   nmi_trigger_cpumask_backtrace+0x7b/0x88
>   rcu_dump_cpu_stacks+0x7b/0xa9
>   rcu_sched_clock_irq.cold+0x152/0x39b
>   update_process_times+0x1f/0x50
>   tick_sched_timer+0x40/0x90
>   ? tick_sched_do_timer+0x50/0x50
>   __hrtimer_run_queues+0xdd/0x180
>   hrtimer_interrupt+0x108/0x230
>   smp_apic_timer_interrupt+0x53/0xa0
>   apic_timer_interrupt+0xf/0x20
>   </IRQ>
> RIP: 0010:queued_spin_lock_slowpath+0x3d/0x1a0
> Code: 3e f0 0f ba 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 75 18 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 <8b> 07 84 c0 75 f8 66 c7 07 01 00 c3 f6 c4 01 75 04 c6 47 01 00 48
> RSP: 0018:ffffc90001e87d08 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
> RAX: 0000000000000101 RBX: ffff8887f96f0000 RCX: ffff8887f96f0000
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8887f96f0658
> RBP: 000000000000000d R08: 0000000000000001 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff8887f50b7080
> R13: ffffffff82424480 R14: ffffffff82424480 R15: ffff8887f50b70c0
>   ptrace_may_access+0x1e/0x50
>   do_task_stat+0x7b/0xd90
>   ? do_filp_open+0xab/0x100
>   proc_single_show+0x54/0xc0
>   ? __kmalloc+0x183/0x210
>   seq_read+0xbb/0x3c0
>   vfs_read+0xc6/0x150
>   ksys_read+0x6b/0x100
>   do_syscall_64+0x3d/0x120
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f666796082e
> Code: c0 e9 f6 fe ff ff 50 48 8d 3d b6 5d 0a 00 e8 e9 fd 01 00 66 0f 1f 84 00 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28
> RSP: 002b:00007ffe26688b38 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
> RAX: ffffffffffffffda RBX: 00007f6667a58958 RCX: 00007f666796082e
> RDX: 0000000000000400 RSI: 000000000153e590 RDI: 0000000000000004
> RBP: 00007f6667a58950 R08: 00007f6667a36120 R09: 00007ffe266889e0
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
> R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffe2668aee9
> rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 14-... } 85041 jiffies s: 45 root: 0x4000/.
> rcu: blocking rcu_node structures:
> Task dump for CPU 14:
> pidof           R  running task        0   520      1 0x8000000c
> Call Trace:
>   ? do_syscall_64+0x3d/0x120
>   ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
> rcu: INFO: rcu_sched self-detected stall on CPU
> rcu:    14-....: (209792 ticks this GP) idle=216/1/0x4000000000000002 softirq=454/454 fqs=52439
>          (t=210009 jiffies g=-755 q=3747)
> NMI backtrace for cpu 14
> CPU: 14 PID: 520 Comm: pidof Tainted: G      D           5.5.7 #1
> Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X470 Taichi, BIOS P3.50 07/18/2019
> Call Trace:
>   <IRQ>
>   dump_stack+0x50/0x70
>   nmi_cpu_backtrace.cold+0x14/0x53
>   ? lapic_can_unplug_cpu.cold+0x44/0x44
>   nmi_trigger_cpumask_backtrace+0x7b/0x88
>   rcu_dump_cpu_stacks+0x7b/0xa9
>   rcu_sched_clock_irq.cold+0x152/0x39b
>   update_process_times+0x1f/0x50
>   tick_sched_timer+0x40/0x90
>   ? tick_sched_do_timer+0x50/0x50
>   __hrtimer_run_queues+0xdd/0x180
>   hrtimer_interrupt+0x108/0x230
>   smp_apic_timer_interrupt+0x53/0xa0
>   apic_timer_interrupt+0xf/0x20
>   </IRQ>
> RIP: 0010:queued_spin_lock_slowpath+0x3d/0x1a0
> Code: 3e f0 0f ba 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 75 18 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 <8b> 07 84 c0 75 f8 66 c7 07 01 00 c3 f6 c4 01 75 04 c6 47 01 00 48
> RSP: 0018:ffffc90001e87d08 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
> RAX: 0000000000000101 RBX: ffff8887f96f0000 RCX: ffff8887f96f0000
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8887f96f0658
> RBP: 000000000000000d R08: 0000000000000001 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff8887f50b7080
> R13: ffffffff82424480 R14: ffffffff82424480 R15: ffff8887f50b70c0
>   ptrace_may_access+0x1e/0x50
>   do_task_stat+0x7b/0xd90
>   ? do_filp_open+0xab/0x100
>   proc_single_show+0x54/0xc0
>   ? __kmalloc+0x183/0x210
>   seq_read+0xbb/0x3c0
>   vfs_read+0xc6/0x150
>   ksys_read+0x6b/0x100
>   do_syscall_64+0x3d/0x120
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f666796082e
> Code: c0 e9 f6 fe ff ff 50 48 8d 3d b6 5d 0a 00 e8 e9 fd 01 00 66 0f 1f 84 00 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28
> RSP: 002b:00007ffe26688b38 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
> RAX: ffffffffffffffda RBX: 00007f6667a58958 RCX: 00007f666796082e
> RDX: 0000000000000400 RSI: 000000000153e590 RDI: 0000000000000004
> RBP: 00007f6667a58950 R08: 00007f6667a36120 R09: 00007ffe266889e0
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
> R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffe2668aee9
> rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 14-... } 150577 jiffies s: 45 root: 0x4000/.
> rcu: blocking rcu_node structures:
> Task dump for CPU 14:
> pidof           R  running task        0   520      1 0x8000000c
> Call Trace:
>   ? do_syscall_64+0x3d/0x120
>   ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
>
>



[Index of Archives]     [Kernel]     [Gnu Classpath]     [Gnu Crypto]     [DM Crypt]     [Netfilter]     [Bugtraq]

  Powered by Linux