Re: [RFC bpf-next 1/5] bpf: enable sleepable BPF programs attached to cgroup/{get,set}sockopt.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jul 31, 2023 at 3:02 PM Kui-Feng Lee <sinquersw@xxxxxxxxx> wrote:
>
> Sorry for the late reply! I just backed from a vacation.
>
>
> On 7/24/23 11:36, Stanislav Fomichev wrote:
> > On 07/21, kuifeng@xxxxxxxx wrote:
> >> From: Kui-Feng Lee <kuifeng@xxxxxxxx>
> >>
> >> Enable sleepable cgroup/{get,set}sockopt hooks.
> >>
> >> The sleepable BPF programs attached to cgroup/{get,set}sockopt hooks may
> >> received a pointer to the optval in user space instead of a kernel
> >> copy. ctx->user_optval and ctx->user_optval_end are the pointers to the
> >> begin and end of the user space buffer if receiving a user space
> >> buffer. ctx->optval and ctx->optval_end will be a kernel copy if receiving
> >> a kernel space buffer.
> >>
> >> A program receives a user space buffer if ctx->flags &
> >> BPF_SOCKOPT_FLAG_OPTVAL_USER is true, otherwise it receives a kernel space
> >> buffer.  The BPF programs should not read/write from/to a user space buffer
> >> dirrectly.  It should access the buffer through bpf_copy_from_user() and
> >> bpf_copy_to_user() provided in the following patches.
> >>
> >> Signed-off-by: Kui-Feng Lee <kuifeng@xxxxxxxx>
> >> ---
> >>   include/linux/filter.h         |   3 +
> >>   include/uapi/linux/bpf.h       |   9 ++
> >>   kernel/bpf/cgroup.c            | 189 ++++++++++++++++++++++++++-------
> >>   kernel/bpf/verifier.c          |   7 +-
> >>   tools/include/uapi/linux/bpf.h |   9 ++
> >>   tools/lib/bpf/libbpf.c         |   2 +
> >>   6 files changed, 176 insertions(+), 43 deletions(-)
> >>
> >> diff --git a/include/linux/filter.h b/include/linux/filter.h
> >> index f69114083ec7..301dd1ba0de1 100644
> >> --- a/include/linux/filter.h
> >> +++ b/include/linux/filter.h
> >> @@ -1345,6 +1345,9 @@ struct bpf_sockopt_kern {
> >>      s32             level;
> >>      s32             optname;
> >>      s32             optlen;
> >> +    u32             flags;
> >> +    u8              *user_optval;
> >> +    u8              *user_optval_end;
> >>      /* for retval in struct bpf_cg_run_ctx */
> >>      struct task_struct *current_task;
> >>      /* Temporary "register" for indirect stores to ppos. */
> >> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> >> index 739c15906a65..b2f81193f97b 100644
> >> --- a/include/uapi/linux/bpf.h
> >> +++ b/include/uapi/linux/bpf.h
> >> @@ -7135,6 +7135,15 @@ struct bpf_sockopt {
> >>      __s32   optname;
> >>      __s32   optlen;
> >>      __s32   retval;
> >> +
> >> +    __bpf_md_ptr(void *, user_optval);
> >> +    __bpf_md_ptr(void *, user_optval_end);
> >
> > Can we re-purpose existing optval/optval_end pointers
> > for the sleepable programs? IOW, when the prog is sleepable,
> > pass user pointers via optval/optval_end and require the programs
> > to do copy_to/from on this buffer (even if the backing pointer might be
> > in kernel memory - we can handle that in the kfuncs?).
> >
> > The fact that the program now needs to look at the flag
> > (BPF_SOCKOPT_FLAG_OPTVAL_USER) and decide which buffer to
> > use makes the handling even more complicated; and we already have a
> > bunch of hairy stuff in these hooks. (or I misreading the change?)
> >
> > Also, regarding sleepable and non-sleepable co-existence: do we really need
> > that? Can we say that all the programs have to be sleepable
> > or non-sleepable? Mixing them complicates the sharing of that buffer.
>
> I considered this approach as well. This is an open question for me.
> If we go this way, it means we can not attach a BPF program of a type
> if any program of the other type has been installed.

If we pass two pointers (kernel copy buffer + real user mem) to the
sleepable program, we'll make it even more complicated by inheriting
all existing warts of the non-sleepable version :-(
IOW, feels like we should try to see if we can have some
copy_to/from_user kfuncs in the sleepable version that transparently
support either kernel or user memory (and prohibit direct access to
user_optval in the sleepable version).
And then, if we have one non-sleepable program in the chain, we can
fallback everything to the kernel buffer (maybe).
This way seems like we can support both versions in the same chain and
have a more sane api?





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux