On Tue, Jun 25, 2024 at 4:52 PM Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote: > > On 2024/06/26 4:32, Alexei Starovoitov wrote: > >>>>> On 2024-06-25, Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote: > >>>>>> syzbot is reporting circular locking dependency inside __bpf_prog_run(), > >>>>>> for fault injection calls printk() despite rq lock is already held. > > > > If you want to add printk_deferred_enter() it > > probably should be in should_fail_ex(). Not here. > > We will not be wrapping all bpf progs this way. > > should_fail_ex() is just an instance. > Three months ago you said "bpf never calls printk()" at > https://lkml.kernel.org/r/CAADnVQLmLMt2bF9aAB26dtBCvy2oUFt+AAKDRgTTrc7Xk_zxJQ@xxxxxxxxxxxxxx , > but bpf does indirectly call printk() due to debug functionality. > > We will be able to stop wrapping with printk_deferred_enter() after the printk > rework completes ( https://lkml.kernel.org/r/ZXBCB2Gv1O-1-T6f@alley ). But we > can't predict how long we need to wait for all console drivers to get converted. > > Until the printk rework completes, it is responsibility of the caller to guard > whatever possible printk() with rq lock already held. If you think that only > individual function that may call printk() (e.g. should_fail_ex()) should be > wrapped, just saying "We will not be wrapping all bpf progs this way" does not > help, for we need to scatter migrate_{disable,enable}() overhead as well as > printk_deferred_{enter,exit}() to individual function despite majority of callers > do not call e.g. should_fail_ex() with rq lock already held. Only who needs to > call e.g. should_fail_ex() with rq lock already held should pay the cost. In this > case, the one who should pay the cost is tracing hooks that are called with rq > lock already held. I don't think that it is reasonable to add overhead to all > users because tracing hooks might not be enabled or bpf program might not call > e.g. should_fail_ex(). > > If you have a method that we can predict whether e.g. should_fail_ex() is called, > you can wrap only bpf progs that call e.g. should_fail_ex(). But it is your role > to maintain list of functions that might trigger printk(). I think that you don't > want such burden (as well as all users don't want burden/overhead of adding > migrate_{disable,enable}() only for the sake of bpf subsystem). You are missing the point. The bug has nothing to do with bpf. It can happen without any bpf loaded. Exactly the same way. should_fail_usercopy() is called on all user accesses.