Re: [PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs

Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> · Wed, 6 Mar 2024 20:05:05 -0800

On Wed, Mar 6, 2024 at 4:13 AM Christian Brauner <brauner@xxxxxxxxxx> wrote:
>
> On Wed, Mar 06, 2024 at 12:21:28PM +0100, Christian Brauner wrote:
> > On Wed, Mar 06, 2024 at 07:39:14AM +0000, Matt Bobrowski wrote:
> > > G'day All,
> > >
> > > The original cover letter providing background context and motivating
> > > factors around the needs for the BPF kfuncs introduced within this
> > > patch series can be found here [0], so please do reference that if
> > > need be.
> > >
> > > Notably, one of the main contention points within v1 of this patch
> > > series was that we were effectively leaning on some preexisting
> > > in-kernel APIs such as get_task_exe_file() and get_mm_exe_file()
> > > within some of the newly introduced BPF kfuncs. As noted in my
> > > response here [1] though, I struggle to understand the technical
> > > reasoning behind why exposing such in-kernel helpers, specifically
> > > only to BPF LSM program types in the form of BPF kfuncs, is inherently
> > > a terrible idea. So, until someone provides me with a sound technical
> > > explanation as to why this cannot or should not be done, I'll continue
> > > to lean on them. The alternative is to reimplement the necessary
> > > in-kernel APIs within the BPF kfuncs, but that's just nonsensical IMO.
> >
> > You may lean as much as you like. What I've reacted to is that you've
> > (not you specifically, I'm sure) messed up. You've exposed d_path() to
> > users  without understanding that it wasn't safe apparently.
> >
> > And now we get patches that use the self-inflicted brokeness as an
> > argument to expose a bunch of other low-level helpers to fix that.
> >
> > The fact that it's "just bpf LSM" programs doesn't alleviate any
> > concerns whatsoever. Not just because that is just an entry vector but
> > also because we have LSMs induced API abuse that we only ever get to see
> > the fallout from when we refactor apis and then it causes pain for the vfs.
> >
> > I'll take another look at the proposed helpers you need as bpf kfuncs
> > and I'll give my best not to be overly annoyed by all of this. I have no
> > intention of not helping you quite the opposite but I'm annoyed that
> > we're here in the first place.
> >
> > What I want is to stop this madness of exposing stuff to users without
> > fully understanding it's semantics and required guarantees.
>
> So, looking at this series you're now asking us to expose:
>
> (1) mmgrab()
> (2) mmput()
> (3) fput()
> (5) get_mm_exe_file()
> (4) get_task_exe_file()
> (7) get_task_fs_pwd()
> (6) get_task_fs_root()
> (8) path_get()
> (9) path_put()
>
> in one go and the justification in all patches amounts to "This is
> common in some BPF LSM programs".
>
> So, broken stuff got exposed to users or at least a broken BPF LSM
> program was written somewhere out there that is susceptible to UAFs
> becauase you didn't restrict bpf_d_path() to trusted pointer arguments.
> So you're now scrambling to fix this by asking for a bunch of low-level
> exports.
>
> What is the guarantee that you don't end up writing another BPF LSM that
> abuses these exports in a way that causes even more issues and then
> someone else comes back asking for the next round of bpf funcs to be
> exposed to fix it.

There is no guarantee.
We made a safety mistake with bpf_d_path() though
we restricted it very tight. And that UAF is tricky.
I'm still amazed how Jann managed to find it.
We all make mistakes.
It's not the first one and not going to be the last.

What Matt is doing is an honest effort to fix it
in the upstream kernel for all bpf users to benefit.
He could have done it with a kernel module.
The above "low level" helpers are all either static inline
in .h or they call EXPORT_SYMBOL[_GPL] or simply inc/dec refcnt.

One can implement such kfuncs in an out of tree kernel module
and be done with it, but in the bpf community we encourage
everyone to upstream their work.

So kudos to Matt for working on these patches.

His bpf-lsm use case is not special.
It just needs a safe way to call d_path.
Let's look at one of the test in patch 9:

+SEC("lsm.s/file_open")
+__failure __msg("R1 must be referenced or trusted")
+int BPF_PROG(path_d_path_kfunc_untrusted_from_current)
+{
+       struct path *pwd;
+       struct task_struct *current;
+
+       current = bpf_get_current_task_btf();
+       /* Walking a trusted pointer returned from bpf_get_current_task_btf()
+        * yields and untrusted pointer. */
+       pwd = &current->fs->pwd;
+       bpf_path_d_path(pwd, buf, sizeof(buf));
+       return 0;
+}

This test checks that such an access pattern is unsafe and
the verifier will catch it.

To make it safe one needs to do:

  current = bpf_get_current_task_btf();
  pwd = bpf_get_task_fs_pwd(current);
  if (!pwd) // error path
  bpf_path_d_path(pwd, ...);
  bpf_put_path(pwd);

these are the kfuncs from patch 6.

And notice that they have KF_ACQUIRE and KF_RELEASE flags.

They tell the verifier to recognize that bpf_get_task_fs_pwd()
kfunc acquires 'struct path *'.
Meaning that bpf prog cannot just return without releasing it.

The bpf prog cannot use-after-free that 'pwd' either
after it was released by bpf_put_path(pwd).

The verifier static analysis catches such UAF-s.
It didn't catch Jann's UAF earlier, because we didn't have
these kfuncs! Hence the fix is to add such kfuncs with
acquire/release semantics.

> The difference between a regular LSM asking about this and a BPF LSM
> program is that we can see in the hook implementation what the LSM
> intends to do with this and we can judge whether that's safe or not.

See above example.
The verifier is doing a much better job than humans when it comes
to safety.

> Here you're asking us to do this blindfolded.

If you don't trust the verifier to enforce safety,
you shouldn't trust Rust compiler to generate safe code either.

In another reply you've compared kfuncs to EXPORT_SYMBOL_GPL.
Such analogy is correct to some extent,
but unlike exported symbols kfuncs are restricted to particular
program types. They don't accept arbitrary pointers,
and reference count is enforced as well.
That's a pretty big difference vs EXPORT_SYMBOL.

> Because really, all I see immediately supportable is the addition of a
> safe variant of bpf making use of the trusted pointer argument
> constraint:
>
> [PATCH v2 bpf-next 8/9] bpf: add trusted d_path() based BPF kfunc bpf_path_d_path()

This kfunc alone is useful in limited cases.
See above example. For simple LSM file_open hook
the bpf prog needs bpf_get_task_fs_pwd() to use this d_path kfunc.