On Wed, Apr 12, 2023 at 7:52 PM Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
>
> From: Yafang <laoar.shao@xxxxxxxxx>
>
> The recursion check in __bpf_prog_enter* and __bpf_prog_exit*
> leave preempt_count_{sub,add} unprotected. When attaching trampoline to
> them we get panic as follows,
>
> [ 867.843050] BUG: TASK stack guard page was hit at 0000000009d325cf (stack is 0000000046a46a15..00000000537e7b28)
> [ 867.843064] stack guard page: 0000 [#1] PREEMPT SMP NOPTI
> [ 867.843067] CPU: 8 PID: 11009 Comm: trace Kdump: loaded Not tainted 6.2.0+ #4
> [ 867.843100] Call Trace:
> [ 867.843101] <TASK>
> [ 867.843104] asm_exc_int3+0x3a/0x40
> [ 867.843108] RIP: 0010:preempt_count_sub+0x1/0xa0
> [ 867.843135] __bpf_prog_enter_recur+0x17/0x90
> [ 867.843148] bpf_trampoline_6442468108_0+0x2e/0x1000
> [ 867.843154] ? preempt_count_sub+0x1/0xa0
> [ 867.843157] preempt_count_sub+0x5/0xa0
> [ 867.843159] ? migrate_enable+0xac/0xf0
> [ 867.843164] __bpf_prog_exit_recur+0x2d/0x40
> [ 867.843168] bpf_trampoline_6442468108_0+0x55/0x1000
> ...
> [ 867.843788] preempt_count_sub+0x5/0xa0
> [ 867.843793] ? migrate_enable+0xac/0xf0
> [ 867.843829] __bpf_prog_exit_recur+0x2d/0x40
> [ 867.843837] BUG: IRQ stack guard page was hit at 0000000099bd8228 (stack is 00000000b23e2bc4..000000006d95af35)
> [ 867.843841] BUG: IRQ stack guard page was hit at 000000005ae07924 (stack is 00000000ffd69623..0000000014eb594c)
> [ 867.843843] BUG: IRQ stack guard page was hit at 00000000028320f0 (stack is 00000000034b6438..0000000078d1bcec)
> [ 867.843842] bpf_trampoline_6442468108_0+0x55/0x1000
> ...
>
> That is because in __bpf_prog_exit_recur, the preempt_count_{sub,add} are
> called after prog->active is decreased.
>
> Fixing this by adding these two functions into btf ids deny list.
>
> Suggested-by: Steven Rostedt <rostedt@xxxxxxxxxxx>
> Signed-off-by: Yafang <laoar.shao@xxxxxxxxx>
> Cc: Masami Hiramatsu <mhiramat@xxxxxxxxxx>
> Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
> Cc: Jiri Olsa <olsajiri@xxxxxxxxx>
> ---

Thanks Yafang,

Acked-by: Hao Luo <haoluo@xxxxxxxxxx>

I happened to be looking at a similar problem the other day. I was
wondering if we can trace preempt_{enable, disable}. It turns out those
functions are not covered by the recursion protection. It makes sense
to add them to the denylist.

Hao
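
For illustration only, the exit-path ordering described in the quoted
commit message can be sketched as a toy userspace model. This is not
kernel code: struct model, helper(), tramp() and simulate() are all
hypothetical names, helper() merely stands in for preempt_count_sub(),
and the "limit" field exists only so the model stops instead of
overflowing its stack. It models just the one ordering stated in the
commit message (the counter is dropped before the helper is called).

```c
/* Toy userspace model; every name here is hypothetical, not a kernel API. */
struct model {
    int active;      /* stands in for prog->active */
    int attached;    /* 1: a trampoline is attached to helper() */
    int depth;       /* current trampoline nesting */
    int max_depth;   /* deepest nesting seen */
    int limit;       /* cap so the model stops instead of overflowing */
    int guard_hits;  /* times the recursion guard saw active != 1 */
};

static void tramp(struct model *m);

/* Stands in for preempt_count_sub(): re-enters the trampoline if traced. */
static void helper(struct model *m)
{
    if (m->attached && m->depth < m->limit)
        tramp(m);
}

static void tramp(struct model *m)
{
    m->depth++;
    if (m->depth > m->max_depth)
        m->max_depth = m->depth;

    /* "enter": take the recursion guard and count any detected recursion */
    if (++m->active != 1)
        m->guard_hits++;

    /* "exit": the guard is dropped first, then the helper is called */
    m->active--;
    helper(m);

    m->depth--;
}

/* Returns the deepest nesting reached; guard_hits may be NULL. */
int simulate(int attached, int limit, int *guard_hits)
{
    struct model m = { 0, attached, 0, 0, limit, 0 };

    helper(&m);
    if (guard_hits)
        *guard_hits = m.guard_hits;
    return m.max_depth;
}
```

In this model, simulate(1, 64, &hits) nests all the way to the cap while
the guard never trips, since each nested "enter" sees the counter back
at zero. With attached set to 0, which is roughly what a deny list
achieves by refusing the attach, the helper never re-enters at all.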