Re: [PATCH bpf-next v2 03/20] bpf: support bpf tracing/iter programs for BPF_LINK_CREATE

Alexei Starovoitov <ast@xxxxxx> · Tue, 5 May 2020 17:54:05 -0700

On 5/5/20 5:14 PM, Yonghong Song wrote:

On 5/5/20 2:30 PM, Andrii Nakryiko wrote:
On Sun, May 3, 2020 at 11:26 PM Yonghong Song <yhs@xxxxxx> wrote:

Given a bpf program, the step to create an anonymous bpf iterator is:
   - create a bpf_iter_link, which combines bpf program and the target.
     In the future, there could be more information recorded in the 
link.
     A link_fd will be returned to the user space.
   - create an anonymous bpf iterator with the given link_fd.

The bpf_iter_link can be pinned to bpffs mount file system to
create a file based bpf iterator as well.

The benefit to use of bpf_iter_link:
   - using bpf link simplifies design and implementation as bpf link
     is used for other tracing bpf programs.
   - for file based bpf iterator, bpf_iter_link provides a standard
     way to replace underlying bpf programs.
   - for both anonymous and free based iterators, bpf link query
     capability can be leveraged.

The patch added support of tracing/iter programs for BPF_LINK_CREATE.
A new link type BPF_LINK_TYPE_ITER is added to facilitate link
querying. Currently, only prog_id is needed, so there is no
additional in-kernel show_fdinfo() and fill_link_info() hook
is needed for BPF_LINK_TYPE_ITER link.

Signed-off-by: Yonghong Song <yhs@xxxxxx>
---

LGTM. See small nit about __GFP_NOWARN.

Acked-by: Andrii Nakryiko <andriin@xxxxxx>

  include/linux/bpf.h            |  1 +
  include/linux/bpf_types.h      |  1 +
  include/uapi/linux/bpf.h       |  1 +
  kernel/bpf/bpf_iter.c          | 62 ++++++++++++++++++++++++++++++++++
  kernel/bpf/syscall.c           | 14 ++++++++
  tools/include/uapi/linux/bpf.h |  1 +
  6 files changed, 80 insertions(+)

[...]

+int bpf_iter_link_attach(const union bpf_attr *attr, struct bpf_prog 
*prog)
+{
+       struct bpf_link_primer link_primer;
+       struct bpf_iter_target_info *tinfo;
+       struct bpf_iter_link *link;
+       bool existed = false;
+       u32 prog_btf_id;
+       int err;
+
+       if (attr->link_create.target_fd || attr->link_create.flags)
+               return -EINVAL;
+
+       prog_btf_id = prog->aux->attach_btf_id;
+       mutex_lock(&targets_mutex);
+       list_for_each_entry(tinfo, &targets, list) {
+               if (tinfo->btf_id == prog_btf_id) {
+                       existed = true;
+                       break;
+               }
+       }
+       mutex_unlock(&targets_mutex);
+       if (!existed)
+               return -ENOENT;
+
+       link = kzalloc(sizeof(*link), GFP_USER | __GFP_NOWARN);

nit: all existing link implementation don't specify __GFP_NOWARN,
wonder if bpf_iter_link should be special?

Nothing special. Just feel __GFP_NOWARN is the right thing to do to 
avoid pollute dmesg since -ENOMEM is returned to user space. But in
reality, unlike some key/value allocation where the size could be huge
and __GFP_NOWARN might be more useful, here, sizeof(*link) is fixed
and small, __GFP_NOWARN probably not that useful.

Will drop it.

actually all existing user space driven allocation have nowarn.
If we missed it in other link allocs we should probably add it.