Re: [PATCH bpf-next v2 1/7] bpf: Implement bpf_probe_read_kernel_dynptr helper

On 2025/1/28 10:57, Alexei Starovoitov wrote:
On Mon, Jan 27, 2025 at 3:09 PM Andrii Nakryiko
<andrii.nakryiko@xxxxxxxxx> wrote:
On Mon, Jan 27, 2025 at 2:54 PM Andrei Matei <andreimatei1@xxxxxxxxx> wrote:
On Mon, Jan 27, 2025 at 5:04 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
On Sat, Jan 25, 2025 at 5:05 PM Levi Zim <rsworktech@xxxxxxxxxxx> wrote:
On 2025/1/26 00:58, Alexei Starovoitov wrote:
  > On Sat, Jan 25, 2025 at 12:30 AM Levi Zim via B4 Relay
  > <devnull+rsworktech.outlook.com@xxxxxxxxxx> wrote:
  >> From: Levi Zim <rsworktech@xxxxxxxxxxx>
  >>
  >> This patch adds a helper function bpf_probe_read_kernel_dynptr:
  >>
  >> long bpf_probe_read_kernel_dynptr(const struct bpf_dynptr *dst,
  >>          u32 offset, u32 size, const void *unsafe_ptr, u64 flags);
  > We stopped adding helpers years ago.
  > Only new kfuncs are allowed.

Sorry, I didn't know that. Is there any documentation or discussion
about the decision to stop adding helpers?

I will switch the implementation to kfuncs in v3.

  > This particular one doesn't look useful as-is.
  > The same logic can be expressed with
  > - create dynptr
  > - dynptr_slice
  > - copy_from_kernel

By copy_from_kernel I assume you mean bpf_probe_read_kernel. The problem
with dynptr_slice_rdwr and probe_read_kernel is that they only support a
compile-time constant size [1].

But to make the best use of the space on a BPF ringbuf, it is possible
to reserve a variable-length region as a dynptr on the ringbuf with
bpf_ringbuf_reserve_dynptr.
For our uprobes, we've run into similar issues around doing variable-sized
bpf_probe_read_user() into ring buffers for our debugger [1]. Our use case
is that we generate uprobes that recursively read data structures until we
fill up a buffer. The verifier's insistence on knowing statically that a read
fits into the buffer makes for awkward code, and makes it hard to pack the
buffer fully; we have to split our reads into a couple of static size classes.
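
To illustrate why static size classes make it hard to pack the buffer
fully, here is a minimal userspace sketch. It is plain C, not BPF code,
and the size classes (256, 64, 16) and both function names are
hypothetical; the thread does not say which sizes the debugger uses.

```c
#include <stddef.h>

/* Hypothetical size classes, largest first. The thread does not say which
 * sizes the debugger actually uses. */
static const size_t size_classes[] = { 256, 64, 16 };
#define NUM_CLASSES (sizeof(size_classes) / sizeof(size_classes[0]))

/* Largest class that still fits into `room` bytes, or 0 if none does. */
static size_t pick_size_class(size_t room)
{
	for (size_t i = 0; i < NUM_CLASSES; i++)
		if (size_classes[i] <= room)
			return size_classes[i];
	return 0;
}

/* Bytes of a `room`-byte buffer that can be filled when every read must
 * use one of the constant sizes above. */
static size_t fill_with_classes(size_t room)
{
	size_t used = 0, c;

	while ((c = pick_size_class(room - used)) != 0)
		used += c;
	return used;
}
```

With these assumed classes, a buffer with 100 bytes of room is filled
with reads of 64 + 16 + 16 bytes, so the last 4 bytes stay unused. A read
whose size is a runtime value would not leave such a tail.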

Any chance there'd be interest in taking the opportunity to support
dynamically-sized reads from userspace too? :)
That's bpf_probe_read_user_dynptr() from patch #2, no?

But generally speaking, here's a list of new APIs that we'd need to
cover all existing fixed buffer versions:

- non-sleepable probe reads:

   bpf_probe_read_kernel_dynptr()
   bpf_probe_read_user_dynptr()
   bpf_probe_read_kernel_str_dynptr()
   bpf_probe_read_user_str_dynptr()

- sleepable probe reads (copy_from_user):

   bpf_copy_from_user_dynptr()
   bpf_copy_from_user_str_dynptr()

- and then we have complementary task-based APIs for non-current process:

   bpf_probe_read_user_task_dynptr()
   bpf_probe_read_user_str_task_dynptr()
   bpf_copy_from_user_task_dynptr()
   bpf_copy_from_user_str_task_dynptr()

Jordan is working on a non-dynptr version of
bpf_copy_from_user_str_task(); once he's done with that, we'll
probably add a dynptr version.
This is quite a bunch of kfuncs.
The approach of adding a _dynptr suffix and duplicating
kfuncs doesn't look like it scales.

The _str_dynptr versions might not be worth adding [1].
So only four kfuncs, read_{kernel,user}_dynptr and
copy_from_user{,_task}_dynptr, are needed, which seems manageable for now.

But taking other helpers like bpf_strtol into account quickly shows that
this approach is not scalable.

Let's make the existing helpers/kfuncs more flexible?

We can introduce a kfunc bpf_dynptr_buf() that checks that the
dynptr is not read-only and that its type is local or ringbuf, and
returns dynptr->data as PTR_TO_MEM | dynptr_flag | VERIFIER_ADDS_SIZE_CHECK.

Then allow bpf_probe_read_user/kernel/... (all of them) to accept
this register type wherever PTR_TO_MEM is required,
while relaxing the ARG_CONST_SIZE 2nd argument to ARG_ANYTHING.
Then the verifier will insert an extra check
if (arg1->size < arg2)
before the call.
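
A rough userspace model of this proposal might look as follows. This is
only a sketch in plain C, not kernel or verifier code: every model_*
name and the struct layouts are hypothetical stand-ins, and the error
codes returned on failure are my assumption, since the thread does not
specify what happens when the inserted check fails.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the kernel's dynptr types. */
enum model_dynptr_type {
	MODEL_DYNPTR_LOCAL,
	MODEL_DYNPTR_RINGBUF,
	MODEL_DYNPTR_OTHER,
};

struct model_dynptr {
	void *data;
	uint32_t size;
	enum model_dynptr_type type;
	bool rdonly;
};

/* What the returned register would carry: a pointer plus its known size. */
struct model_mem {
	void *ptr;
	uint32_t size;
};

/* Models the proposed bpf_dynptr_buf(): only a writable dynptr of type
 * local or ringbuf yields a buffer. -EINVAL is an assumed error code. */
static int model_dynptr_buf(const struct model_dynptr *d, struct model_mem *out)
{
	if (d->rdonly)
		return -EINVAL;
	if (d->type != MODEL_DYNPTR_LOCAL && d->type != MODEL_DYNPTR_RINGBUF)
		return -EINVAL;
	out->ptr = d->data;
	out->size = d->size;
	return 0;
}

/* Models a probe read whose size argument is a runtime value. The first
 * branch stands in for the check the verifier would insert before the
 * call. -E2BIG is an assumed error code. */
static int model_probe_read(struct model_mem dst, uint32_t size, const void *src)
{
	if (dst.size < size)	/* inserted check: arg1->size < arg2 */
		return -E2BIG;
	memcpy(dst.ptr, src, size);
	return 0;
}
```

In this model, a 4-byte read into an 8-byte ringbuf-backed buffer
succeeds, while a 9-byte read is rejected by the inserted check before
any copy happens.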
Nice idea. I will try this approach first.

Not only will bpf_probe_read_kernel/user and their _str variants work,
but things like bpf_strtol, bpf_strncmp, bpf_snprintf, bpf_get_stack
will automagically work as well.

I think those are quite valuable to make available with non-constant size.
bpf_get_stack_*() directly into the ring buffer sounds very useful.

[1]: https://lore.kernel.org/bpf/20250125-bpf_dynptr_probe-v2-0-c42c87f97afe@xxxxxxxxxxx/T/#m9700146d286a88abc0b25ef47041015ba6c477a3




