> On Nov 18, 2022, at 6:25 PM, Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Fri, Nov 18, 2022 at 5:06 PM Song Liu <songliubraving@xxxxxxxx> wrote:
>>
>>> On Nov 18, 2022, at 3:45 PM, Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
>>>
>>> On Fri, Nov 18, 2022 at 7:40 AM Jiri Olsa <jolsa@xxxxxxxxxx> wrote:
>>>>
>>>> Adding bpf_vma_build_id_parse function to retrieve build id from
>>>> passed vma object and making it available as bpf kfunc.
>>>>
>>>> We can't use build_id_parse directly as kfunc, because we would
>>>> not have control over the build id buffer size provided by user.
>>>>
>>>> Instead we are adding new bpf_vma_build_id_parse function with
>>>> 'build_id__sz' argument that instructs verifier to check for the
>>>> available space in build_id buffer.
>>>>
>>>> This way we check that there's always available memory space
>>>> behind build_id pointer. We also check that the build_id__sz is
>>>> at least BUILD_ID_SIZE_MAX so we can place any buildid in.
>>>>
>>>> Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx>
>>>> ---
>>>>  include/linux/bpf.h      |  4 ++++
>>>>  kernel/bpf/verifier.c    | 26 ++++++++++++++++++++++++++
>>>>  kernel/trace/bpf_trace.c | 31 +++++++++++++++++++++++++++++++
>>>>  3 files changed, 61 insertions(+)
>>>>
>>>> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
>>>> index 8b32376ce746..7648188faa2c 100644
>>>> --- a/include/linux/bpf.h
>>>> +++ b/include/linux/bpf.h
>>>> @@ -2805,4 +2805,8 @@ static inline bool type_is_alloc(u32 type)
>>>>         return type & MEM_ALLOC;
>>>>  }
>>>>
>>>> +int bpf_vma_build_id_parse(struct vm_area_struct *vma,
>>>> +                           unsigned char *build_id,
>>>> +                           size_t build_id__sz);
>>>> +
>>>>  #endif /* _LINUX_BPF_H */
>>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>>>> index 195d24316750..e20bad754a3a 100644
>>>> --- a/kernel/bpf/verifier.c
>>>> +++ b/kernel/bpf/verifier.c
>>>> @@ -8746,6 +8746,29 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
>>>>         return 0;
>>>>  }
>>>>
>>>> +BTF_ID_LIST_SINGLE(bpf_vma_build_id_parse_id, func, bpf_vma_build_id_parse)
>>>> +
>>>> +static int check_kfunc_caller(struct bpf_verifier_env *env, u32 func_id)
>>>> +{
>>>> +        struct bpf_func_state *cur;
>>>> +        struct bpf_insn *insn;
>>>> +
>>>> +        /* Allow bpf_vma_build_id_parse only from bpf_find_vma callback */
>>>> +        if (func_id == bpf_vma_build_id_parse_id[0]) {
>>>> +                cur = env->cur_state->frame[env->cur_state->curframe];
>>>> +                if (cur->callsite != BPF_MAIN_FUNC) {
>>>> +                        insn = &env->prog->insnsi[cur->callsite];
>>>> +                        if (insn->imm == BPF_FUNC_find_vma)
>>>> +                                return 0;
>>>> +                }
>>>> +                verbose(env, "calling bpf_vma_build_id_parse outside bpf_find_vma "
>>>> +                        "callback is not allowed\n");
>>>> +                return -1;
>>>> +        }
>>>> +
>>>> +        return 0;
>>>> +}
>>>
>>> I understand that calling bpf_vma_build_id_parse from find_vma
>>> is your only use case, but put yourself in the maintainer's shoes.
>>> We just did an arbitrary restriction and helped a single user.
>>> How are we going to explain this to other users?
>>> Let's figure out a more generic way where this call is safe.
>>> Have you looked at PTR_TRUSTED approach that David is doing
>>> for task_struct ? Can something like this be used here?
>>
>> I guess that won't work, as the vma is not refcounted. :( This is
>> why we have to hold mmap_lock when calling task_vma programs.
>>
>> OTOH, I would imagine bpf_vma_build_id_parse being quite useful for
>> task_vma programs.
>
> Of course we cannot increment non-existing refcnt in vma :)
> I meant that PTR_TRUSTED part of the concept. The kfunc
> bpf_vma_build_id_parse(struct vm_area_struct *vma, ...)
> should have KF_TRUSTED_ARGS flag
> and it will be the job of the verifier to pass a trusted vma pointer.
> Meaning that the verifier needs to guarantee that
> the pointer is safe to operate on.
> That's what I was explaining to Kumar and David earlier
> about KF_TRUSTED_ARGS semantics.
>
> PTR_TRUSTED doesn't mean that the pointer is refcnted.
> It means that it won't disappear and we can safely pass it
> to kfunc or helpers.
> For bpf_find_vma we can mark vma pointer PTR_TRUSTED on entry
> into callback bpf prog and the prog will be able to pass it
> to bpf_vma_build_id_parse() kfunc as long as the prog doesn't
> add any offset to it.
> The implementation of bpf_find_vma() guarantees that vma ptr
> passed into callback_fn is valid.
> So it's exactly PTR_TRUSTED.
>
> Similarly task_vma programs will be receiving PTR_TRUSTED pointers too
> and will be able to call bpf_vma_build_id_parse() kfunc as well.
> Any place where we can guarantee the safety of the pointer
> we should be marking it as PTR_TRUSTED.
>
> David's series start with marking all tp_btf arguments as PTR_TRUSTED.
> Doing this for iterators, bpf_find_vma callback
> will be a continuation of PTR_TRUSTED logic.

I see. So PTR_TRUSTED task_struct is a refcounted task_struct, while
PTR_TRUSTED vm_area_struct is a vma with its mm_struct locked.
That makes perfect sense. Thanks for the explanation!

Song
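
For reference, the PTR_TRUSTED / KF_TRUSTED_ARGS rule described above can be
sketched as a toy user-space model. This is only an illustration under stated
assumptions, not verifier code: every toy_* name and TOY_* flag below is
hypothetical, and the real verifier tracks far more state. The sketch shows
the one rule from the discussion: a pointer marked trusted on entry to the
callback may be passed to a kfunc that requires trusted args, as long as the
program has not added an offset to it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags for illustration; not the kernel's definitions. */
#define TOY_PTR_TRUSTED      (1u << 0)  /* pointer is known to stay valid */
#define TOY_KF_TRUSTED_ARGS  (1u << 0)  /* kfunc accepts only trusted pointers */

/* Minimal stand-in for the verifier's per-register state. */
struct toy_reg {
	unsigned int flags; /* TOY_PTR_* bits */
	long off;           /* offset the program has added to the pointer */
};

/* vma argument as a bpf_find_vma-style callback would see it on entry. */
static struct toy_reg toy_callback_vma_arg(void)
{
	struct toy_reg reg = { .flags = TOY_PTR_TRUSTED, .off = 0 };
	return reg;
}

/* Pointer arithmetic in the program: the offset is tracked, flags are kept. */
static struct toy_reg toy_ptr_add(struct toy_reg reg, long delta)
{
	reg.off += delta;
	return reg;
}

/* Would the toy verifier let this register be passed to the kfunc? */
static bool toy_arg_allowed(unsigned int kfunc_flags, struct toy_reg reg)
{
	if (!(kfunc_flags & TOY_KF_TRUSTED_ARGS))
		return true; /* this kfunc has no trust requirement */
	return (reg.flags & TOY_PTR_TRUSTED) && reg.off == 0;
}

/* Three cases: trusted pointer, trusted pointer plus offset, untrusted pointer. */
static void toy_demo(void)
{
	struct toy_reg vma = toy_callback_vma_arg();
	struct toy_reg untrusted = { .flags = 0, .off = 0 };

	assert(toy_arg_allowed(TOY_KF_TRUSTED_ARGS, vma));
	assert(!toy_arg_allowed(TOY_KF_TRUSTED_ARGS, toy_ptr_add(vma, 8)));
	assert(!toy_arg_allowed(TOY_KF_TRUSTED_ARGS, untrusted));
}
```

Note that nothing in this model refers to a refcount: trust here is a property
the caller establishes (for a vma, the locked mm_struct), which is the
distinction drawn in the thread.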