On Mon, Jun 5, 2023 at 9:37 AM Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> wrote: > > On 2023-05-31 12:00:56 [-0700], Andrii Nakryiko wrote: > > > On 2023-05-25 10:51:23 [-0700], Andrii Nakryiko wrote: > > > v2…v3: > > > - Drop bpf_link_put_direct(). Let bpf_link_put() do the direct free > > > and add bpf_link_put_from_atomic() to do the delayed free via the > > > worker. > > > > This seems like an unsafe "default put" choice. I think it makes more > > sense to have bpf_link_put() do a safe scheduled put, and then having > > a separate bpf_link_put_direct() for those special cases where we care > > the most (in kernel/bpf/inode.c and kernel/bpf/syscall.c). > > I audited them and ended up with them all being safe except for the one > from inode.c. I can reverse the logic if you want. I understand it's safe today, but I'm more worried for future places that will do bpf_link_put(). Given it's only close() and BPF FS unlink() that require synchronous semantics, I think it's fine to make bpf_link_put() to be async, and have bpf_link_put_sync() (or direct, or whatever suffix) as a conscious special case. > > > > diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c > > > index 9948b542a470e..2e1e9f3c7f701 100644 > > > --- a/kernel/bpf/inode.c > > > +++ b/kernel/bpf/inode.c > > > @@ -59,7 +59,10 @@ static void bpf_any_put(void *raw, enum bpf_type type) > > > bpf_map_put_with_uref(raw); > > > break; > > > case BPF_TYPE_LINK: > > > - bpf_link_put(raw); > > > + if (may_sleep) > > > + bpf_link_put(raw); > > > + else > > > + bpf_link_put_from_atomic(raw); > > > > Do we need to do this in two different ways here? The only situation > > that has may_sleep=false is when called from superblock->free_inode. > > According to documentation: > > > > Freeing memory in the callback is fine; doing > > more than that is possible, but requires a lot of care and is best > > avoided. > > > > So it feels like cleaning up link should be safe to do from this > > context as well? I've cc'ed linux-fsdevel@, hopefully they can advise. > > This is invoked from the RCU callback (which is usually softirq): > destroy_inode() > -> call_rcu(&inode->i_rcu, i_callback); > > Freeing memory is fine but there is a mutex that is held in the process. I think it should be fine for BPF link destruction then? > > Sebastian