On Mon, Jul 13, 2020 at 1:13 PM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote: > > From: Toke Høiland-Jørgensen <toke@xxxxxxxxxx> > > Sync addition of new members from main kernel tree. > > Signed-off-by: Toke Høiland-Jørgensen <toke@xxxxxxxxxx> > --- > tools/include/uapi/linux/bpf.h | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h > index da9bf35a26f8..662a15e4a1a1 100644 > --- a/tools/include/uapi/linux/bpf.h > +++ b/tools/include/uapi/linux/bpf.h > @@ -573,8 +573,13 @@ union bpf_attr { > } query; > > struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */ > - __u64 name; > - __u32 prog_fd; > + __u64 name; > + __u32 prog_fd; > + __u32 log_level; /* verbosity level of log */ > + __u32 log_size; /* size of user buffer */ > + __aligned_u64 log_buf; /* user supplied buffer */ > + __u32 tgt_prog_fd; > + __u32 tgt_btf_id; > } raw_tracepoint; > > struct { /* anonymous struct for BPF_BTF_LOAD */ > I think BPF syscall would benefit from common/generalized log support across all commands, given how powerful/complex it already is. Sometimes it's literally impossible to understand why one gets -EINVAL without adding printk()s in the kernel. But it feels wrong to add log_level/log_size/log_buf fields to every section of bpf_attr and require the user to specify it differently for each command. So before we go and start adding per-command fields, let's discuss how we can do this more generically. I wonder if we can come up with a good way to do it in one common way and then gradually support that way throughout all BPF commands. Unfortunately it's too late to just add a few common fields to bpf_attr in front of all other per-command fields, but here's two more ideas just to start discussion. I hope someone can come up with something nicer. 1. Currently bpf syscall accepts 3 input arguments (cmd, uattr, size). We can extend it with two more optional arguments: one for pointer to log-defining attr (for log_buf pointer, log_size, log_level, maybe something more later) and another for the size of that log attr. Beyond that we'd need some way to specify that the user actually meant to provide those 2 optional args. The most straightforward way would be to use the highest bit of cmd argument. So instead of calling bpf() with BPF_MAP_CREATE command, you'd call it with BPF_MAP_CREATE | BPF_LOGGED, where BPF_LOGGED = 1<<31. 2. A more "stateful" approach, would be to have an extra BPF command to set log buffer (and level) per thread. And if such a log is set, it would be overwritten with content on each bpf() syscall invocation (i.e., log position would be reset to 0 on each BPF syscall). Of course, the existing BPF_LOAD_PROG command would keep working with existing log, but could fall back to the "common one", if BPF_LOAD_PROG-specific one is not set. It also doesn't seem to be all that critical to signal when the log buffer is overflown. It's pretty easy to detect from user-space: - either last byte would be non-zero, if we don't care about guaranteeing zero-termination for truncated log; - of second-to-last byte would be non-zero, if BPF syscall will always zero-terminate the log. Of course, if user code cares about such detection of log truncation, it will need to set last/second-to-last bytes to zero before each syscall. Both approaches have their pros/cons, we can dig into those later, but I'd like to start this discussion and see what other people think. I also wonder if there are other syscalls with similarly advanced input arguments (like bpf_attr) and common logging, we could learn from those. Thoughts?