On Mon, Mar 20, 2023 at 12:55 AM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > On Sun, Mar 19, 2023 at 03:36:57PM +0800, Yafang Shao wrote: > > On Sat, Mar 18, 2023 at 12:52 AM Alexei Starovoitov > > <alexei.starovoitov@xxxxxxxxx> wrote: > > > > > > On Fri, Mar 17, 2023 at 4:49 AM Yafang Shao <laoar.shao@xxxxxxxxx> wrote: > > > > > > > > It hits below warning on my test machine when running test_progs, > > > > > > > > [ 702.223611] ------------[ cut here ]------------ > > > > [ 702.224168] RCU not on for: preempt_count_sub+0x0/0xa0 > > > > [ 702.224770] WARNING: CPU: 14 PID: 5267 at include/linux/trace_recursion.h:162 fprobe_handler.part.0+0x1b8/0x1c0 > > > > [ 702.231740] CPU: 14 PID: 5267 Comm: main_amd64 Kdump: loaded Tainted: G O 6.2.0+ #584 > > > > [ 702.233169] RIP: 0010:fprobe_handler.part.0+0x1b8/0x1c0 > > > > [ 702.241388] Call Trace: > > > > [ 702.241615] <TASK> > > > > [ 702.241811] fprobe_handler+0x22/0x30 > > > > [ 702.242129] 0xffffffffc04710f7 > > > > [ 702.242417] RIP: 0010:preempt_count_sub+0x5/0xa0 > > > > [ 702.242809] Code: c8 50 68 94 42 0e b5 48 cf e9 f9 fd ff ff 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 e8 4b cd 38 0b <55> 8b 0d 9c d0 cf 02 48 89 e5 85 c9 75 1b 65 8b 05 be 78 f4 4a 89 > > > > [ 702.244752] RSP: 0018:ffffaf6187d27f10 EFLAGS: 00000082 ORIG_RAX: 0000000000000000 > > > > [ 702.245801] RAX: 000000000000000e RBX: 0000000001b6ab72 RCX: 0000000000000000 > > > > [ 702.246804] RDX: 0000000000000000 RSI: ffffffffb627967d RDI: 0000000000000001 > > > > [ 702.247801] RBP: ffffaf6187d27f30 R08: 0000000000000000 R09: 0000000000000000 > > > > [ 702.248786] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000000ca > > > > [ 702.249782] R13: ffffaf6187d27f58 R14: 0000000000000000 R15: 0000000000000000 > > > > [ 702.250785] ? preempt_count_sub+0x5/0xa0 > > > > [ 702.251540] ? syscall_enter_from_user_mode+0x96/0xc0 > > > > [ 702.252368] ? preempt_count_sub+0x5/0xa0 > > > > [ 702.253104] ? syscall_enter_from_user_mode+0x96/0xc0 > > > > [ 702.253918] do_syscall_64+0x16/0x90 > > > > [ 702.254613] entry_SYSCALL_64_after_hwframe+0x72/0xdc > > > > [ 702.255422] RIP: 0033:0x46b793 > > > > > > > > It's caused by bench test attaching kprobe_multi link to preempt_count_sub > > > > function, which is not executed in rcu safe context so the kprobe handler > > > > on top of it will trigger the rcu warning. > > > > > > Why is that? > > > > It is caused by CONFIG_CONTEXT_TRACKING_USER=y, and it seems the > > preempt_count_sub is executed before the RCU is watching. > > user_exit_irqoff > > if (context_tracking_enabled()) // CONFIG_CONTEXT_TRACKING_USER=y > > __ct_user_exit(CONTEXT_USER); > > ct_kernel_enter > > ... > > // RCU is not watching here ... > > ct_kernel_enter_state(offset); > > // ... but is watching here. > > > > It can be reproduced with a simple bpf code as follows when > > CONFIG_CONTEXT_TRACKING_USER=y, > > SEC("kprobe.multi/preempt_count_sub") > > int kprobe_multi_trace() > > { > > return 0; > > } > > > > > preempt_count itself is fine. > > > The problem is elsewhere. > > > Since !rcu_is_watching() it some sort of idle or some other issue. > > > > Not sure if we need to improve the code under > > CONFIG_CONTEXT_TRACKING_USER=y, but it seems skipping "preempt_count_" > > in kprobe_multi test case would be a quick fix. > > It's not a fix. Only moving a goal post. > We probably need > if (!rcu_is_watching()) > return; > in [kf]probe handler instead. Good suggestion. I will think about it. BTW, we can't kprobe preempt_count_sub, because it is a nokprobe symbol. NOKPROBE_SYMBOL(preempt_count_sub); -- Regards Yafang