Re: [RFC PATCH bpf-next 0/3] bpf: add tracing for XDP programs using the BPF_PROG_TEST_RUN API

Yonghong Song <yhs@xxxxxx> · Mon, 23 Mar 2020 15:47:19 -0700

On 3/18/20 6:06 AM, Eelco Chaudron wrote:
I sent out this RFC to get an idea if the approach suggested here
would be something other people would also like to see. In addition,
this cover letter mentions some concerns and questions that need
answers before we can move to an acceptable implementation.

This patch adds support for tracing eBPF XDP programs that get
executed using the __BPF_PROG_RUN syscall. This is done by switching
from JIT (if enabled) to executing the program using the interpreter
and record each executed instruction.

Thanks for working on this! I think this is a useful feature
to do semi single step in a safe environment. The initial input,
e.g., packet or some other kernel context, may be captured
in production error path. People can use this to easily
do some post analysis. This feature can also be used for
initial single-step debugging with better bpftool support.

For now, the execution history is printed to the kernel ring buffer
using pr_info(), the final version should have enough data stored in a
user-supplied buffer to reconstruct this output. This should probably
be part of bpftool, i.e. dump a similar output, and the ability to
store all this in an elf-like format for dumping/analyzing/replaying
at a later stage.

This patch does not dump the XDP packet content before and after
execution, however, this data is available to the caller of the API.

I would like to see the feature is implemented in a way to apply
to all existing test_run program types and extensible to future
program types.

There are different ways to send data back to user. User buffer
is one way, ring buffer is another way, seq_file can also be used.
Performance is not a concern here, so we can choose the one with best
usability.

The __bpf_prog_run_trace() interpreter is a copy of __bpf_prog_run()
and we probably need a smarter way to re-use the code rather than a
blind copy with some changes.

Yes, reusing the code is a must. Using existing interpreter framework
is the easiest for semi single step support.

Enabling the interpreter opens up the kernel for spectre variant 2,
guess that's why the BPF_JIT_ALWAYS_ON option was introduced (commit
290af86629b2). Enabling it for debugging in the field does not sound
like an option (talking to people doing kernel distributions).
Any idea how to work around this (lfence before any call this will
slow down, but I guess for debugging this does not matter)? I need to
research this more as I'm no expert in this area. But I think this
needs to be solved as I see this as a show stopper. So any input is
welcome.

lfence for indirect call is okay here for test_run. Just need to be
careful to no introduce any performance penalty for non-test-run
prog run.

To allow bpf_call support for tracing currently the general
interpreter is enabled. See the fixup_call_args() function for why
this is needed. We might need to find a way to fix this (see the above
section on spectre).

Signed-off-by: Eelco Chaudron <echaudro@xxxxxxxxxx>