Re: [PATCH v3 bpf-next 1/3] bpf: introduce new VFS based BPF kfuncs

Matt Bobrowski <mattbobrowski@xxxxxxxxxx> · Fri, 26 Jul 2024 20:31:27 +0000

On Fri, Jul 26, 2024 at 03:18:25PM +0200, Christian Brauner wrote:
> On Fri, Jul 26, 2024 at 08:56:02AM GMT, Matt Bobrowski wrote:
> > Add a new variant of bpf_d_path() named bpf_path_d_path() which takes
> > the form of a BPF kfunc and enforces KF_TRUSTED_ARGS semantics onto
> > its arguments.
> > 
> > This new d_path() based BPF kfunc variant is intended to address the
> > legacy bpf_d_path() BPF helper's susceptibility to memory corruption
> > issues [0, 1, 2] by ensuring to only operate on supplied arguments
> > which are deemed trusted by the BPF verifier. Typically, this means
> > that only pointers to a struct path which have been referenced counted
> > may be supplied.
> > 
> > In addition to the new bpf_path_d_path() BPF kfunc, we also add a
> > KF_ACQUIRE based BPF kfunc bpf_get_task_exe_file() and KF_RELEASE
> > counterpart BPF kfunc bpf_put_file(). This is so that the new
> > bpf_path_d_path() BPF kfunc can be used more flexibility from within
> > the context of a BPF LSM program. It's rather common to ascertain the
> > backing executable file for the calling process by performing the
> > following walk current->mm->exe_file while instrumenting a given
> > operation from the context of the BPF LSM program. However, walking
> > current->mm->exe_file directly is never deemed to be OK, and doing so
> > from both inside and outside of BPF LSM program context should be
> > considered as a bug. Using bpf_get_task_exe_file() and in turn
> > bpf_put_file() will allow BPF LSM programs to reliably get and put
> > references to current->mm->exe_file.
> > 
> > As of now, all the newly introduced BPF kfuncs within this patch are
> > limited to sleepable BPF LSM program types. Therefore, they may only
> > be called when a BPF LSM program is attached to one of the listed
> > attachment points defined within the sleepable_lsm_hooks BTF ID set.
> > 
> > [0] https://lore.kernel.org/bpf/CAG48ez0ppjcT=QxU-jtCUfb5xQb3mLr=5FcwddF_VKfEBPs_Dg@xxxxxxxxxxxxxx/
> > [1] https://lore.kernel.org/bpf/20230606181714.532998-1-jolsa@xxxxxxxxxx/
> > [2] https://lore.kernel.org/bpf/20220219113744.1852259-1-memxor@xxxxxxxxx/
> > 
> > Signed-off-by: Matt Bobrowski <mattbobrowski@xxxxxxxxxx>
> > ---
> >  fs/Makefile        |   1 +
> >  fs/bpf_fs_kfuncs.c | 133 +++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 134 insertions(+)
> >  create mode 100644 fs/bpf_fs_kfuncs.c
> > 
> > diff --git a/fs/Makefile b/fs/Makefile
> > index 6ecc9b0a53f2..61679fd587b7 100644
> > --- a/fs/Makefile
> > +++ b/fs/Makefile
> > @@ -129,3 +129,4 @@ obj-$(CONFIG_EFIVAR_FS)		+= efivarfs/
> >  obj-$(CONFIG_EROFS_FS)		+= erofs/
> >  obj-$(CONFIG_VBOXSF_FS)		+= vboxsf/
> >  obj-$(CONFIG_ZONEFS_FS)		+= zonefs/
> > +obj-$(CONFIG_BPF_LSM)		+= bpf_fs_kfuncs.o
> > diff --git a/fs/bpf_fs_kfuncs.c b/fs/bpf_fs_kfuncs.c
> > new file mode 100644
> > index 000000000000..3813e2a83313
> > --- /dev/null
> > +++ b/fs/bpf_fs_kfuncs.c
> > @@ -0,0 +1,133 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/* Copyright (c) 2024 Google LLC. */
> > +
> > +#include <linux/bpf.h>
> > +#include <linux/btf.h>
> > +#include <linux/btf_ids.h>
> > +#include <linux/dcache.h>
> > +#include <linux/err.h>
> > +#include <linux/fs.h>
> > +#include <linux/file.h>
> > +#include <linux/init.h>
> > +#include <linux/mm.h>
> > +#include <linux/path.h>
> > +#include <linux/sched.h>
> > +
> > +__bpf_kfunc_start_defs();
> > +/**
> > + * bpf_get_task_exe_file - get a reference on the exe_file struct file member of
> > + *                         the mm_struct that is nested within the supplied
> > + *                         task_struct
> > + * @task: task_struct of which the nested mm_struct exe_file member to get a
> > + * reference on
> > + *
> > + * Get a reference on the exe_file struct file member field of the mm_struct
> > + * nested within the supplied *task*. The referenced file pointer acquired by
> > + * this BPF kfunc must be released using bpf_put_file(). Failing to call
> > + * bpf_put_file() on the returned referenced struct file pointer that has been
> > + * acquired by this BPF kfunc will result in the BPF program being rejected by
> > + * the BPF verifier.
> > + *
> > + * This BPF kfunc may only be called from sleepable BPF LSM programs.
> > + *
> > + * Internally, this BPF kfunc leans on get_task_exe_file(), such that calling
> > + * bpf_get_task_exe_file() would be analogous to calling get_task_exe_file()
> > + * directly in kernel context.
> > + *
> > + * Return: A referenced struct file pointer to the exe_file member of the
> > + * mm_struct that is nested within the supplied *task*. On error, NULL is
> > + * returned.
> > + */
> > +__bpf_kfunc struct file *bpf_get_task_exe_file(struct task_struct *task)
> > +{
> > +	return get_task_exe_file(task);
> > +}
> > +
> > +/**
> > + * bpf_put_file - put a reference on the supplied file
> > + * @file: file to put a reference on
> > + *
> > + * Put a reference on the supplied *file*. Only referenced file pointers may be
> > + * passed to this BPF kfunc. Attempting to pass an unreferenced file pointer, or
> > + * any other arbitrary pointer for that matter, will result in the BPF program
> > + * being rejected by the BPF verifier.
> > + *
> > + * This BPF kfunc may only be called from sleepable BPF LSM programs. Though
> > + * fput() can be called from IRQ context, we're enforcing sleepability here.
> > + */
> > +__bpf_kfunc void bpf_put_file(struct file *file)
> > +{
> > +	fput(file);
> > +}
> > +
> > +/**
> > + * bpf_path_d_path - resolve the pathname for the supplied path
> > + * @path: path to resolve the pathname for
> > + * @buf: buffer to return the resolved pathname in
> > + * @buf__sz: length of the supplied buffer
> > + *
> > + * Resolve the pathname for the supplied *path* and store it in *buf*. This BPF
> > + * kfunc is the safer variant of the legacy bpf_d_path() helper and should be
> > + * used in place of bpf_d_path() whenever possible. It enforces KF_TRUSTED_ARGS
> > + * semantics, meaning that the supplied *path* must itself hold a valid
> > + * reference, or else the BPF program will be outright rejected by the BPF
> > + * verifier.
> > + *
> > + * This BPF kfunc may only be called from sleepable BPF LSM programs.
> > + *
> > + * Return: A positive integer corresponding to the length of the resolved
> > + * pathname in *buf*, including the NUL termination character. On error, a
> > + * negative integer is returned.
> > + */
> > +__bpf_kfunc int bpf_path_d_path(struct path *path, char *buf, size_t buf__sz)
> > +{
> > +	int len;
> > +	char *ret;
> > +
> > +	if (buf__sz <= 0)
> > +		return -EINVAL;
> 
> size_t is unsigned so this should just be !buf__sz I can fix that
> though.

Sure, that would be great if you wouldn't mind?

> The __sz thing has meaning to the verifier afaict so I guess that's
> fine as name then.

That's right, it's used to signal that a buffer and it's associated
size exists within the BPF kfuncs argument list. Using the __sz
annotation specifically allows the BPF verifier to deduce which size
argument is meant to be bounded to a given buffer.

/M