On Wed, 27 Nov 2024 18:27:57 -0800 Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx>
> On Wed, Nov 27, 2024 at 3:04 PM Hillf Danton <hdanton@xxxxxxxx> wrote:
> > On Tue, 26 Nov 2024 13:15:48 -0800 Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx>
> > > On Mon, Nov 25, 2024 at 1:44 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > > > On Mon, Nov 25, 2024 at 05:24:05AM +0000, Ruan Bonan wrote:
> > > >
> > > > > From the discussion, it appears that the root cause might involve
> > > > > specific printk or BPF operations in the given context. To clarify and
> > > > > possibly avoid similar issues in the future, are there guidelines or
> > > > > best practices for writing BPF programs/hooks that interact with
> > > > > tracepoints, especially those related to scheduler events, to prevent
> > > > > such deadlocks?
> > > >
> > > > The general guideline and recommendation for all tracepoints is to be
> > > > wait-free. Typically all tracer code should be.
> > > >
> > > > Now, BPF (users) (ab)uses tracepoints to do all sorts and takes certain
> > > > liberties with them, but it is very much at the discretion of the BPF
> > > > user.
> > >
> > > We do assume that tracepoints are just like kprobes and can run in
> > > NMI. And in this case BPF is just a vehicle to trigger a
> > > promised-to-be-wait-free strncpy_from_user_nofault(). That's as far as
> > > BPF involvement goes, we should stop discussing BPF in this context,
> > > it's misleading.
> > >
> > Given known issue, syzbot should run without bpf enabled before it is fixed
> > to avoid more useless discussing and misleading.
>
> If you cared to read the thread it would have been obvious
> that printk is the culprit. Tell syzbot to run without printk?
>
Printk is innocent, and it makes no sense to put the gun vendor in jail
simply because bpf shot a sheriff in the cafeteria.