Re: [RFC PATCH bpf-next 05/16] bpf: create file or anonymous dumpers

On Wed, Apr 8, 2020 at 4:26 PM Yonghong Song <yhs@xxxxxx> wrote:
>
> Given a loaded dumper bpf program, which already
> knows which target it should bind to, there
> are two ways to create a dumper:
>   - a file based dumper under hierarchy of
>     /sys/kernel/bpfdump/ which users can
>     "cat" to print out the output.
>   - an anonymous dumper which user application
>     can "read" the dumping output.
>
> For file based dumper, BPF_OBJ_PIN syscall interface
> is used. For anonymous dumper, BPF_PROG_ATTACH
> syscall interface is used.

We discussed this offline with Yonghong a bit, but I thought I'd put
my thoughts about this in writing for completeness. To me, it seems
like the most consistent way to do both anonymous and named dumpers is
through the following steps:

1. BPF_PROG_LOAD to load/verify the program, which creates a program FD.
2. LINK_CREATE using that program FD and direntry FD. This creates a
dumper bpf_link (bpf_dumper_link) and returns an anonymous link FD. If
the link FD is closed, the dumper program is detached and the dumper is
destroyed (unless pinned in bpffs, just like with any other bpf_link).
3. At this point bpf_dumper_link can be treated like a factory of
seq_files. We can add a new BPF_DUMPER_OPEN_FILE (all names are for
illustration purposes) command, that accepts dumper link FD and
returns a new seq_file FD, which can be read() normally (or, e.g.,
cat'ed from shell).
4. Additionally, this anonymous bpf_link can be pinned/mounted in
bpfdumpfs. We can do it as BPF_OBJ_PIN or as a separate command. Once
pinned at, e.g., /sys/fs/bpfdump/task/my_dumper, just opening that
file is equivalent to BPF_DUMPER_OPEN_FILE and will create a new
seq_file that can be read() independently from other seq_files opened
against the same dumper. Pinning bpfdumpfs entry also bumps refcnt of
bpf_link itself, so even if process that created link dies, bpf dumper
stays attached until its bpfdumpfs entry is deleted.

Apart from the duality between BPF_DUMPER_OPEN_FILE and open()'ing a
bpfdumpfs file, it seems pretty consistent and follows the
safe-by-default auto-cleanup of an anonymous link, unless pinned in
bpfdumpfs (one can still pin the bpf_link in bpffs, but it can't be
open()'ed the same way; it just keeps the BPF program from being
cleaned up).

Out of all schemes I could come up with, this one seems most unified
and nicely fits into bpf_link infra. Thoughts?

>
> To facilitate target seq_ops->show() to get the
> bpf program easily, dumper creation increased
> the target-provided seq_file private data size
> so bpf program pointer is also stored in seq_file
> private data.
>
> Further, a seq_num which represents how many
> times bpf_dump_get_prog() has been called is also
> available to the target seq_ops->show().
> Such information can be used to e.g., print
> banner before printing out actual data.
>
> Note the seq_num does not represent the num
> of unique kernel objects the bpf program has
> seen. But it should be a good approximation.
>
> A target feature BPF_DUMP_SEQ_NET_PRIVATE
> is implemented, which is specifically useful for
> net based dumpers. It sets net namespace
> as the current process net namespace.
> This avoids changing existing net seq_ops
> in order to retrieve net namespace from
> the seq_file pointer.
>
> For open dumper files, anonymous or not, the
> fdinfo will show the target and prog_id associated
> with that file descriptor. For dumper file itself,
> a kernel interface will be provided to retrieve the
> prog_id in one of the later patches.
>
> Signed-off-by: Yonghong Song <yhs@xxxxxx>
> ---
>  include/linux/bpf.h            |   5 +
>  include/uapi/linux/bpf.h       |   6 +-
>  kernel/bpf/dump.c              | 338 ++++++++++++++++++++++++++++++++-
>  kernel/bpf/syscall.c           |  11 +-
>  tools/include/uapi/linux/bpf.h |   6 +-
>  5 files changed, 362 insertions(+), 4 deletions(-)
>

[...]


