On Mon, 29 Jul 2024 at 14:27, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Mon, Jul 29, 2024 at 01:46:09PM +0200, Radoslaw Zielonek wrote:
> > I am currently working on a syzbot-reported bug where bpf
> > is called from trace_sched_switch. In this scenario, we are still within
> > the scheduler context, and calling printk can create a deadlock.
> >
> > I am uncertain about the best approach to fix this issue.
>
> It's been like this forever, it doesn't need fixing, because tracepoints
> shouldn't be doing printk() in the first place.
>
> > Should we simply forbid such calls, or perhaps we should replace printk
> > with printk_deferred in the bpf where we are still in scheduler context?
>
> Not doing printk() is best.

And teaching more debugging tools to behave. This particular case
originates from fault injection:

> [   60.265518][ T8343]  should_fail_ex+0x383/0x4d0
> [   60.265547][ T8343]  strncpy_from_user+0x36/0x2d0
> [   60.265601][ T8343]  strncpy_from_user_nofault+0x70/0x140
> [   60.265637][ T8343]  bpf_probe_read_user_str+0x2a/0x70

The fail_dump() function in lib/fault-inject.c is probably being a
little too verbose in this case.

Radoslaw, the fix should be in lib/fault-inject.c. Similar to other
debugging tools (like KFENCE, which you discovered), adding
lockdep_off()/lockdep_on(), using printk_deferred(), or not being as
verbose in this context may be more appropriate. Fault injection does
not need to print a message to inject a fault - the message is for
debugging purposes. Probably a reasonable compromise is to use
printk_deferred() in fail_dump() when in this context, to still help
with debugging on a best-effort basis. You also need to take care to
avoid dumping the stack in fail_dump().
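
To make the suggested compromise concrete, here is a rough user-space
model of it, not a patch against lib/fault-inject.c. Every name in it
(model_fail_dump(), model_print_deferred(), in_sched_context, and so on)
is hypothetical and exists only for illustration; the message text is
simplified, and how the real fail_dump() would detect that it is running
in scheduler context is left open. The model only shows the intended
behaviour: in the unsafe context the message is queued and the stack dump
is skipped, and the queue is drained later from a safe context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define LOG_SLOTS 8
#define LOG_LEN   128

static char pending[LOG_SLOTS][LOG_LEN]; /* queued (deferred) messages */
static int  npending;                    /* number of queued messages */
static int  nprinted;                    /* messages that reached the "console" */
static bool in_sched_context;            /* stands in for "still inside the scheduler" */

/* Model of an immediate print: assumed unsafe while in_sched_context is set. */
static void model_print_now(const char *msg)
{
	assert(msg != NULL);
	fputs(msg, stdout);
	nprinted++;
}

/* Model of a deferred print: only queue the text, never touch the console. */
static void model_print_deferred(const char *msg)
{
	assert(msg != NULL);
	if (npending >= LOG_SLOTS)
		return;		/* best effort: drop the message when the queue is full */
	strncpy(pending[npending], msg, LOG_LEN - 1);
	pending[npending][LOG_LEN - 1] = '\0';
	npending++;
}

/* Drain the queue; meant to be called later from a safe context. */
static void model_flush(void)
{
	for (int i = 0; i < npending; i++)
		model_print_now(pending[i]);
	npending = 0;
}

/* Model of the proposed fail_dump() behaviour. */
static void model_fail_dump(const char *what)
{
	char msg[LOG_LEN];

	snprintf(msg, sizeof(msg),
		 "FAULT_INJECTION: forcing a failure in %s\n", what);

	if (in_sched_context) {
		/* Unsafe context: defer the message and skip the stack dump. */
		model_print_deferred(msg);
		return;
	}

	/* Safe context: print right away, followed by the (modelled) stack dump. */
	model_print_now(msg);
	model_print_now("(stack dump would go here)\n");
}
```

In this model the fault is still injected either way; only the reporting
changes. A caller in scheduler context sees nothing printed until
model_flush() runs, which mirrors the "best effort" nature of the
suggestion.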