BPF logging infrastructure. Was: [PATCH bpf-next 4/6] tools: add new members to bpf_attr.raw_tracepoint in bpf.h

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jul 13, 2020 at 1:13 PM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>
> From: Toke Høiland-Jørgensen <toke@xxxxxxxxxx>
>
> Sync addition of new members from main kernel tree.
>
> Signed-off-by: Toke Høiland-Jørgensen <toke@xxxxxxxxxx>
> ---
>  tools/include/uapi/linux/bpf.h |    9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index da9bf35a26f8..662a15e4a1a1 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -573,8 +573,13 @@ union bpf_attr {
>         } query;
>
>         struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
> -               __u64 name;
> -               __u32 prog_fd;
> +               __u64           name;
> +               __u32           prog_fd;
> +               __u32           log_level;      /* verbosity level of log */
> +               __u32           log_size;       /* size of user buffer */
> +               __aligned_u64   log_buf;        /* user supplied buffer */
> +               __u32           tgt_prog_fd;
> +               __u32           tgt_btf_id;
>         } raw_tracepoint;
>
>         struct { /* anonymous struct for BPF_BTF_LOAD */
>

I think BPF syscall would benefit from common/generalized log support
across all commands, given how powerful/complex it already is.
Sometimes it's literally impossible to understand why one gets -EINVAL
without adding printk()s in the kernel.

But it feels wrong to add log_level/log_size/log_buf fields to every
section of bpf_attr and require the user to specify it differently for
each command. So before we go and start adding per-command fields,
let's discuss how we can do this more generically. I wonder if we can
come up with a good way to do it in one common way and then gradually
support that way throughout all BPF commands.

Unfortunately it's too late to just add a few common fields to
bpf_attr in front of all other per-command fields, but here's two more
ideas just to start discussion. I hope someone can come up with
something nicer.

1. Currently bpf syscall accepts 3 input arguments (cmd, uattr, size).
We can extend it with two more optional arguments: one for pointer to
log-defining attr (for log_buf pointer, log_size, log_level, maybe
something more later) and another for the size of that log attr.
Beyond that we'd need some way to specify that the user actually meant
to provide those 2 optional args. The most straightforward way would
be to use the highest bit of cmd argument. So instead of calling bpf()
with BPF_MAP_CREATE command, you'd call it with BPF_MAP_CREATE |
BPF_LOGGED, where BPF_LOGGED = 1<<31.

2. A more "stateful" approach, would be to have an extra BPF command
to set log buffer (and level) per thread. And if such a log is set, it
would be overwritten with content on each bpf() syscall invocation
(i.e., log position would be reset to 0 on each BPF syscall).

Of course, the existing BPF_LOAD_PROG command would keep working with
existing log, but could fall back to the "common one", if
BPF_LOAD_PROG-specific one is not set.

It also doesn't seem to be all that critical to signal when the log
buffer is overflown. It's pretty easy to detect from user-space:
- either last byte would be non-zero, if we don't care about
guaranteeing zero-termination for truncated log;
- of second-to-last byte would be non-zero, if BPF syscall will always
zero-terminate the log.

Of course, if user code cares about such detection of log truncation,
it will need to set last/second-to-last bytes to zero before each
syscall.

Both approaches have their pros/cons, we can dig into those later, but
I'd like to start this discussion and see what other people think. I
also wonder if there are other syscalls with similarly advanced input
arguments (like bpf_attr) and common logging, we could learn from
those.

Thoughts?




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux