Re: [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs

Matt Bobrowski <mattbobrowski@xxxxxxxxxx> · Tue, 12 Mar 2024 17:06:36 +0000

Hey Christian,

On Mon, Mar 11, 2024 at 01:00:56PM +0100, Christian Brauner wrote:
> On Fri, Mar 08, 2024 at 05:23:30PM -0800, Alexei Starovoitov wrote:
> > On Fri, Mar 8, 2024 at 2:36 AM Christian Brauner <brauner@xxxxxxxxxx> wrote:
> > >
> > >
> > > These exports are specifically for an out-of-tree BPF LSM program that
> > > is not accessible to the public. The question in the other mail stands.
> > 
> > The question was already answered. You just don't like the answer.
> > bpf progs are not equivalent to kernel modules.
> > They have completely different safety and visibility properties.
> > The safety part I already talked about.
> > Sounds like the visibility has to be explained.
> > Kernel modules are opaque binary blobs.
> > bpf programs are fully transparent. The intent is known
> > to the verifier and to anyone with understanding
> > of bpf assembly.
> > Those that cannot read bpf asm can read C source code that is
> > embedded in the bpf program in kernel memory.
> > It's not the same as "llvm-dwarfdump module.ko" on disk.
> > The bpf prog source code is loaded into the kernel
> > at program verification time for debugging and visibility reasons.
> > If there is a verifier bug and bpf manages to crash the kernel
> > vmcore will have relevant lines of program C source code right there.
> > 
> > Hence out-of-tree or in-tree bpf makes no practical difference.
> > The program cannot hide its meaning and doesn't hamper debugging.
> > 
> > Hence adding EXPORT_SYMBOL == Brace for impact!
> > Expect crashes, api misuse and what not.
> > 
> > While adding bpf_kfunc is a nop for kernel development.
> > If kfunc is in the way of code refactoring it can be removed
> > (as we demonstrated several times).
> > A kfunc won't cause headaches for the kernel code it is
> > calling (assuming no verifier bugs).
> > If there is a bug it's on us to fix it as we demonstrated in the past.
> > For example: bpf_probe_read_kernel().
> > It's a wrapper of copy_from_kernel_nofault() and over the years
> > bpf users hit various bugs in copy_from_kernel_nofault(),
> > reported them, and _bpf developers_ fixed them.
> > Though copy_from_kernel_nofault() is as generic as it can get
> > and the same bugs could have been reproduced without bpf
> > we took care of fixing these parts of the kernel.
> > 
> > Look at path_put().
> > It's EXPORT_SYMBOL and any kernel module can easily screw up
> > reference counting, so that sooner or later distro folks
> > will experience debug pains due to out-of-tree drivers.
> > 
> > kfunc that calls path_put() won't have such consequences.
> > The verifier will prevent path_put() on a pointer that wasn't
> > acquired by the same bpf program. No support pains.
> > It's a nop for vfs folks.
> > 
> > > > First of all, there is no such thing as get_task_fs_pwd/root
> > > > in the kernel.
> > >
> > > Yeah, we'd need specific helpers for a never seen before out-of-tree BPF
> > > LSM. I don't see how that's different from an out-of-tree kernel module.
> > 
> > Sorry, but you don't seem to understand what bpf can and cannot do,
> > hence they look similar.
> 
> Maybe. On the other hand you seem to ignore what I'm saying. You
> currently don't have a clear set of rules for when it's ok for someone
> to send patches and request access to bpf kfuncs to implement a new BPF
> program. This patchset very much illustrates this point. The safety
> properties of bpf don't matter for this. And again, your safety
> properties very much didn't protect you from your bpf_d_path() mess.
> 
> We're not even clearly told where and how these helper are supposed to be
> used. That's not ok and will never be ok. As long as there are no clear
> criteria to operate under this is highly problematic. This may be fine
> from a bpf perspective and one can even understand why because that's
> apparently your model or promise to your users. But there's no reason to
> expect the same level of laxness from any of the subsystems you're
> requesting kfuncs from.

You raise a completely fair point, and I truly do apologies for the
lack of context and in depth explanations around the specific
situations that the proposed BPF kfuncs are intended to be used
from. Admittedly, that's a failure on my part, and I can completely
understand why from a maintainers point of view there would be
reservations around acknowledging requests for adding such invisible
dependencies.

Now, I'm in a little bit of a tough situation as I'm unable to point
you to an open-source BPF LSM implementation that intends to make use
of such newly proposed BPF kfuncs. That's just an unfortunate
constraint and circumstance that I'm having to deal with, so I'm just
going to have to provide heavily redacted and incomplete example to
illustrate how these BPF kfuncs intend to be used from BPF LSM
programs that I personally work on here at Google. Notably though, the
contexts that I do share here may obviously be a nonholistic view on
how these newly introduced BPF kfuncs end up getting used in practice
by some other completely arbitrary open-source BPF LSM programs.

Anyway, as Alexei had pointed out in one of the prior responses, the
core motivating factor behind introducing these newly proposed BPF
kfuncs purely stems from the requirement of needing to call
bpf_d_path() safely on a struct path from the context of a BPF LSM
program, specifically within the security_file_open() and
security_mmap_file() LSM hooks. Now, as noted within the original bug
report [0], it's currently not considered safe to pluck a struct path
out from an arbitrary in-kernel data structure, which in our case was
current->mm->exe_file->f_path, and have it passed to bpf_d_path() from
the aforementioned LSM hook points, or any other LSM hook point for
that matter.

So, without using these newly introduced BPF kfuncs, our BPF LSM
program hanging off security_file_open() looks as follows:

```
int BPF_PROG(file_open, struct file *file)
{
  // Perform a whole bunch of operations on the supplied file argument. This
  // includes some form of policy evaluation, and if there's a violation against
  // policy and auditing is enabled, then we eventually call bpf_d_path() on
  // file->f_path. Calling bpf_d_path() on the file argument isn't problematic
  // as we have a stable path here as the file argument is reference counted.
  struct path *target = &file->f_path;

  // ...

  struct task_struct *current = bpf_get_current_task_btf();

  // ...

  bpf_rcu_read_lock();
  // Reserve a slot on the BPF ring buffer such that the actor's path can be
  // passed back to userspace.
  void *buf = bpf_ringbuf_reserve(&ringbuf, PATH_MAX, 0);
  if (!buf) {
    goto unlock;
  }

  // For contextual purposes when performing an audit we also call bpf_d_path()
  // on the actor, being current->mm->exe_file->f_path.
  struct path *actor = &current->mm->exe_file->f_path;

  // Now perform the path resolution on the actor via bpf_d_path().
  u64 ret = bpf_d_path(actor, buf, PATH_MAX);
  if (ret > 0) {
    bpf_ringbuf_submit(buf, BPF_RB_NO_WAKEUP);
  } else {
    bpf_ringbuf_discard(buf, 0);
  }

unlock:
  bpf_rcu_read_unlock();
  return 0;
}
```

Post landing these BPF kfuncs, the BPF LSM program hanging off
security_file_open() would be updated to make use of the
acquire/release BPF kfuncs as below. Here I'm only making use of
bpf_get_task_exe_file(), but similar usage also extends to
bpf_get_task_fs_root() and bpf_get_task_fs_pwd().

```
int BPF_PROG(file_open, struct file *file)
{
  // Perform a whole bunch of operations on the supplied file argument. This
  // includes some form of policy evaluation, and if there's a violation against
  // policy and auditing is enabled, then we eventually call bpf_path_d_path()
  // on file->f_path. Calling bpf_path_d_path() on the file argument isn't
  // problematic as we have a stable path here as the file argument is trusted
  // and reference counted.
  struct path *target = &file->f_path;

  // ...

  struct task_struct *current = bpf_get_current_task_btf();

  // ...

  // Reserve a slot on the BPF ring buffer such that the actor's path can be
  // passed back to userspace.
  void *buf = bpf_ringbuf_reserve(&ringbuf, PATH_MAX, 0);
  if (!buf) {
    return 0;
  }

  // For contextual purposes when performing an audit we also call
  // bpf_path_d_path() on the actor, being current->mm->exe_file->f_path.
  // Here we're operating on a stable trused and reference counted file,
  // thanks to bpf_get_task_exe_file().
  struct file *exe_file = bpf_get_task_exe_file(current);
  if (!exe_file) {
    bpf_ringbuf_discard(buf, 0);
    return 0;
  }

  // Now perform the path resolution on the actor via bpf_path_d_path(), which
  // only accepts a trusted struct path.
  u64 ret = bpf_path_d_path(&exe_file->f_path, buf, PATH_MAX);
  if (ret > 0) {
    bpf_ringbuf_submit(buf, BPF_RB_NO_WAKEUP);
  } else {
    bpf_ringbuf_discard(buf, 0);
  }

  // Drop the reference on exe_file.
  bpf_put_file(exe_file);
  return 0;
}
```

This is rather incredibly straightforward, but the fundamental
difference between the two implementations is that one allows us to
work on stable file, whereas the other does not. That's really
it. Similarly, we do more or less the same for our BPF LSM program
that hangs off security_mmap_file().

Do you need anything else which illustrates the proposed BPF kfunc
usage?

[0] https://lore.kernel.org/bpf/CAG48ez0ppjcT=QxU-jtCUfb5xQb3mLr=5FcwddF_VKfEBPs_Dg@xxxxxxxxxxxxxx/

/M