On 07/21, kuifeng@xxxxxxxx wrote: > From: Kui-Feng Lee <kuifeng@xxxxxxxx> > > Enable sleepable cgroup/{get,set}sockopt hooks. > > The sleepable BPF programs attached to cgroup/{get,set}sockopt hooks may > received a pointer to the optval in user space instead of a kernel > copy. ctx->user_optval and ctx->user_optval_end are the pointers to the > begin and end of the user space buffer if receiving a user space > buffer. ctx->optval and ctx->optval_end will be a kernel copy if receiving > a kernel space buffer. > > A program receives a user space buffer if ctx->flags & > BPF_SOCKOPT_FLAG_OPTVAL_USER is true, otherwise it receives a kernel space > buffer. The BPF programs should not read/write from/to a user space buffer > dirrectly. It should access the buffer through bpf_copy_from_user() and > bpf_copy_to_user() provided in the following patches. > > Signed-off-by: Kui-Feng Lee <kuifeng@xxxxxxxx> > --- > include/linux/filter.h | 3 + > include/uapi/linux/bpf.h | 9 ++ > kernel/bpf/cgroup.c | 189 ++++++++++++++++++++++++++------- > kernel/bpf/verifier.c | 7 +- > tools/include/uapi/linux/bpf.h | 9 ++ > tools/lib/bpf/libbpf.c | 2 + > 6 files changed, 176 insertions(+), 43 deletions(-) > > diff --git a/include/linux/filter.h b/include/linux/filter.h > index f69114083ec7..301dd1ba0de1 100644 > --- a/include/linux/filter.h > +++ b/include/linux/filter.h > @@ -1345,6 +1345,9 @@ struct bpf_sockopt_kern { > s32 level; > s32 optname; > s32 optlen; > + u32 flags; > + u8 *user_optval; > + u8 *user_optval_end; > /* for retval in struct bpf_cg_run_ctx */ > struct task_struct *current_task; > /* Temporary "register" for indirect stores to ppos. */ > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > index 739c15906a65..b2f81193f97b 100644 > --- a/include/uapi/linux/bpf.h > +++ b/include/uapi/linux/bpf.h > @@ -7135,6 +7135,15 @@ struct bpf_sockopt { > __s32 optname; > __s32 optlen; > __s32 retval; > + > + __bpf_md_ptr(void *, user_optval); > + __bpf_md_ptr(void *, user_optval_end); Can we re-purpose existing optval/optval_end pointers for the sleepable programs? IOW, when the prog is sleepable, pass user pointers via optval/optval_end and require the programs to do copy_to/from on this buffer (even if the backing pointer might be in kernel memory - we can handle that in the kfuncs?). The fact that the program now needs to look at the flag (BPF_SOCKOPT_FLAG_OPTVAL_USER) and decide which buffer to use makes the handling even more complicated; and we already have a bunch of hairy stuff in these hooks. (or I misreading the change?) Also, regarding sleepable and non-sleepable co-existence: do we really need that? Can we say that all the programs have to be sleepable or non-sleepable? Mixing them complicates the sharing of that buffer.