On Mon, Mar 2, 2020 at 1:40 PM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>
> Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> writes:
>
> > On Mon, Mar 2, 2020 at 2:13 AM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
> >>
> >> Andrii Nakryiko <andriin@xxxxxx> writes:
> >>
> >> > Introduce bpf_link abstraction, representing an attachment of BPF program to
> >> > a BPF hook point (e.g., tracepoint, perf event, etc). bpf_link encapsulates
> >> > ownership of attached BPF program, reference counting of a link itself, when
> >> > reference from multiple anonymous inodes, as well as ensures that release
> >> > callback will be called from a process context, so that users can safely take
> >> > mutex locks and sleep.
> >> >
> >> > Additionally, with a new abstraction it's now possible to generalize pinning
> >> > of a link object in BPF FS, allowing to explicitly prevent BPF program
> >> > detachment on process exit by pinning it in a BPF FS and let it open from
> >> > independent other process to keep working with it.
> >> >
> >> > Convert two existing bpf_link-like objects (raw tracepoint and tracing BPF
> >> > program attachments) into utilizing bpf_link framework, making them pinnable
> >> > in BPF FS. More FD-based bpf_links will be added in follow up patches.
> >> >
> >> > Signed-off-by: Andrii Nakryiko <andriin@xxxxxx>
> >> > ---
> >> >  include/linux/bpf.h  |  13 +++
> >> >  kernel/bpf/inode.c   |  42 ++++++++-
> >> >  kernel/bpf/syscall.c | 209 ++++++++++++++++++++++++++++++++++++-------
> >> >  3 files changed, 226 insertions(+), 38 deletions(-)
> >> >
> >> > [...]
> >> > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> >> > index c536c65256ad..fca8de7e7872 100644
> >> > --- a/kernel/bpf/syscall.c
> >> > +++ b/kernel/bpf/syscall.c
> >> > @@ -2173,23 +2173,153 @@ static int bpf_obj_get(const union bpf_attr *attr)
> >> >                                 attr->file_flags);
> >> >  }
> >> >
> >> > -static int bpf_tracing_prog_release(struct inode *inode, struct file *filp)
> >> > +struct bpf_link {
> >> > +       atomic64_t refcnt;
> >>
> >> refcount_t ?
> >
> > Both bpf_map and bpf_prog stick to atomic64 for their refcounting, so
> > I'd like to stay consistent and use refcount that can't possibly leak
> > resources (which refcount_t can, if it's overflown).
>
> refcount_t is specifically supposed to turn a possible use-after-free on
> under/overflow into a warning, isn't it? Not going to insist or anything
> here, just found it odd that you'd prefer the other...

Well, underflow is a huge bug that should never happen in well-tested
code (at least that's the assumption for bpf_map and bpf_prog), and we
are generally very careful about that. Overflow can happen only because
refcount_t uses a 32-bit integer, which atomic64_t side-steps completely
by going to a 64-bit integer.

So yeah, I'd rather stick to the same stuff that's used for bpf_map and
bpf_prog.

>
> -Toke
>
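
To make the trade-off above concrete, here is a rough userspace sketch using C11 atomics. It is not kernel code and not what the patch does: link_model, sat32_model and all helper names are made up for illustration, and the saturating counter only approximates the idea behind refcount_t (the real implementation uses a different saturation value and emits warnings). The point is just that a 64-bit counter is not expected to wrap in practice, while a 32-bit counter that saturates on overflow pins (leaks) the object instead of freeing it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an object counted with atomic64_t. */
struct link_model {
	_Atomic int64_t refcnt;
};

static void link_model_init(struct link_model *l)
{
	atomic_init(&l->refcnt, 1);	/* creator holds the first reference */
}

static void link_model_get(struct link_model *l)
{
	atomic_fetch_add(&l->refcnt, 1);
}

/* Returns true when the caller dropped the last reference. */
static bool link_model_put(struct link_model *l)
{
	int64_t old = atomic_fetch_sub(&l->refcnt, 1);

	assert(old > 0);	/* underflow would be a plain bug */
	return old == 1;
}

/* Simplified 32-bit saturating counter, in the spirit of refcount_t. */
struct sat32_model {
	_Atomic uint32_t refs;
};

static void sat32_get(struct sat32_model *c)
{
	uint32_t old = atomic_load(&c->refs);

	/* Once the counter reaches UINT32_MAX it stays there. */
	while (old != UINT32_MAX) {
		if (atomic_compare_exchange_weak(&c->refs, &old, old + 1))
			break;
	}
}

/* Returns true on last put; a saturated counter is never released. */
static bool sat32_put(struct sat32_model *c)
{
	uint32_t old = atomic_load(&c->refs);

	for (;;) {
		if (old == UINT32_MAX)
			return false;	/* saturated: object is leaked */
		if (atomic_compare_exchange_weak(&c->refs, &old, old - 1))
			return old == 1;
	}
}
```

With the 64-bit model every get is matched by a put and the object goes away on the last one. With the saturating model, one get too many near the 32-bit limit means no later put ever reports "last reference", which is the leak mentioned above.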