On Wed, Jun 14, 2023 at 10:34:30AM +0200, Sebastian Andrzej Siewior wrote: > bpf_free_inode() is invoked as a RCU callback. Usually RCU callbacks are > invoked within softirq context. By setting rcutree.use_softirq=0 boot > option the RCU callbacks will be invoked in a per-CPU kthread with > bottom halves disabled which implies a RCU read section. > > On PREEMPT_RT the context remains fully preemptible. The RCU read > section however does not allow schedule() invocation. The latter happens > in mutex_lock() performed by bpf_trampoline_unlink_prog() originated > from bpf_link_put(). Just to make sure that I understand, you are proposing that the RCU callbacks continue to run with BH disabled, but that BH-disabled regions are preemptible in kernels built with CONFIG_PREEMPT_RT=y? Or did I miss a turn in there somewhere? Thanx, Paul > It was pointed out that the bpf_link_put() invocation should not be > delayed if originated from close(). It was also pointed out that other > invocations from within a syscall should also avoid the workqueue. > Everyone else should use workqueue by default to remain safe in the > future (while auditing the code, every caller was preemptible except for > the RCU case). > > Let bpf_link_put() use the worker unconditionally. Add > bpf_link_put_direct() which will directly free the resources and is used > by close() and from within __sys_bpf(). > > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> > --- > v3…v4: > - Revert back to bpf_link_put_direct() to the direct free and let > bpf_link_put() use the worker. Let close() and all invocations from > within the syscall use bpf_link_put_direct() which are all instances > within syscall.c here. > > v2…v3: > - Drop bpf_link_put_direct(). Let bpf_link_put() do the direct free > and add bpf_link_put_from_atomic() to do the delayed free via the > worker. > > v1…v2: > - Add bpf_link_put_direct() to be used from bpf_link_release() as > suggested. > > kernel/bpf/syscall.c | 29 ++++++++++++++++------------- > 1 file changed, 16 insertions(+), 13 deletions(-) > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c > index 14f39c1e573ee..8f09aef5949d4 100644 > --- a/kernel/bpf/syscall.c > +++ b/kernel/bpf/syscall.c > @@ -2777,28 +2777,31 @@ static void bpf_link_put_deferred(struct work_struct *work) > bpf_link_free(link); > } > > -/* bpf_link_put can be called from atomic context, but ensures that resources > - * are freed from process context > +/* bpf_link_put might be called from atomic context. It needs to be called > + * from sleepable context in order to acquire sleeping locks during the process. > */ > void bpf_link_put(struct bpf_link *link) > { > if (!atomic64_dec_and_test(&link->refcnt)) > return; > > - if (in_atomic()) { > - INIT_WORK(&link->work, bpf_link_put_deferred); > - schedule_work(&link->work); > - } else { > - bpf_link_free(link); > - } > + INIT_WORK(&link->work, bpf_link_put_deferred); > + schedule_work(&link->work); > } > EXPORT_SYMBOL(bpf_link_put); > > +static void bpf_link_put_direct(struct bpf_link *link) > +{ > + if (!atomic64_dec_and_test(&link->refcnt)) > + return; > + bpf_link_free(link); > +} > + > static int bpf_link_release(struct inode *inode, struct file *filp) > { > struct bpf_link *link = filp->private_data; > > - bpf_link_put(link); > + bpf_link_put_direct(link); > return 0; > } > > @@ -4764,7 +4767,7 @@ static int link_update(union bpf_attr *attr) > if (ret) > bpf_prog_put(new_prog); > out_put_link: > - bpf_link_put(link); > + bpf_link_put_direct(link); > return ret; > } > > @@ -4787,7 +4790,7 @@ static int link_detach(union bpf_attr *attr) > else > ret = -EOPNOTSUPP; > > - bpf_link_put(link); > + bpf_link_put_direct(link); > return ret; > } > > @@ -4857,7 +4860,7 @@ static int bpf_link_get_fd_by_id(const union bpf_attr *attr) > > fd = bpf_link_new_fd(link); > if (fd < 0) > - bpf_link_put(link); > + bpf_link_put_direct(link); > > return fd; > } > @@ -4934,7 +4937,7 @@ static int bpf_iter_create(union bpf_attr *attr) > return PTR_ERR(link); > > err = bpf_iter_new_fd(link); > - bpf_link_put(link); > + bpf_link_put_direct(link); > > return err; > } > -- > 2.40.1 >