Re: [PATCH v2 bpf-next 1/4] bpf: Search for kptrs in prog BTF structs

Amery Hung <ameryhung@xxxxxxxxx> · Mon, 5 Aug 2024 21:07:14 -0700

On Mon, Aug 5, 2024 at 6:57 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> On 8/6/2024 8:31 AM, Amery Hung wrote:
> > On Mon, Aug 5, 2024 at 3:25 PM Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote:
> >> On 8/5/24 1:44 PM, Amery Hung wrote:
> >>>>>>> Maybe we should move the common btf used by kptr and graph_root into
> >>>>>>> btf_record and let the callers of btf_parse_fields() and
> >>>>>>> btf_record_free() to decide the life cycle of btf in btf_record.
> >>>>>> Could you maybe explain if and why moving btf of btf_field_kptr and
> >>>>>> btf_field_graph_root to btf_record is necessary? I think letting
> >>>>>> callers of btf_parse_fields() and btf_record_free() decide whether or
> >>>>>> not to change refcount should be enough. Besides, I personally would
> >>>>>> like to keep individual btf in btf_field_kptr and
> >>>>>> btf_field_graph_root, so that later we can have special fields
> >>>>>> referencing different btf.
> >>>>> Sorry, I didn't express the rough idea clearly enough. I didn't mean to
> >>>>> move btf of btf_field_kptr and btf_field_graph_root to btf_record,
> >>>>> because there are other btf-s which are different with the btf which
> >>>>> creates the struct_meta_tab. What I was trying to suggest is to save one
> >>>>> btf in btf_record and hope it will simplify the pin and the unpin of btf
> >>>>> in btf_record:
> >>>>>
> >>>>> 1) save the btf which owns the btf_record in btf_record.
> >>>>> 2) during btf_parse_kptr() or similar, if the used btf is the same as
> >>>>> the btf in btf_record, there is no need to pin the btf
> >>>> I assume the used btf is the one that btf_parse is working on.
> >>>>
> >>>>> 3) when freeing the btf_record, if the btf saved in btf_field is the
> >>>>> same as the btf in btf_record, there is no need to put it
> >>>> For btf_field_kptr.btf, is it the same as testing the btf_field_kptr.btf is
> >>>> btf_is_kernel() or not? How about only does btf_get/put for btf_is_kernel()?
> >>>>
> >>> IIUC. It will not be the same. For a map referencing prog btf, I
> >>> suppose we should still do btf_get().
> >>>
> >>> I think the core idea is since a btf_record and the prog btf
> >>> containing it has the same life time, we don't need to
> >>> btf_get()/btf_put() in btf_parse_kptr()/btf_record_free() when a
> >>> btf_field_kptr.btf is referencing itself.
> >>>
> >>> However, since btf_parse_kptr() called from btf_parse() and
> >>> map_check_btf() all use prog btf, we need a way to differentiate the
> >>> two. Hence Hou suggested saving the owner's btf in btf_record, and
> >> map_check_btf() calls btf_parse_kptr(map->btf).
> >>
> >> I am missing how it is different from the
> >> btf_new_fd()=>btf_parse()=>btf_parse_kptr(new_btf).
> >>
> >> akaik, the map->record has no issue now because bpf_map_free_deferred() does
> >> btf_record_free(map->record) before btf_put(map->btf). In the map->record case,
> >> does the map->record need to take a refcnt of the btf_field_kptr.btf if the
> >> btf_field_kptr.btf is pointing back to itself (map->btf) which is not a kernel btf?
>
> I think you are right. For both callees of btf_parse_kptr() when the btf
> saved in btf_field_kptr.btf is the same as the passed btf, there is no
> need to call btf_get(). For bpf map case, just as you remained, map->btf
> has already pin the btf passed to btf_parse_kptr() in map_create().
> >>> then check if btf_record->btf is the same as the btf_field_kptr.btf in
> >>> btf_parse_kptr()/btf_record_free().
> >> I suspect it will have the same end result? The btf_field_kptr.btf is only the
> >> same as the owner's btf when btf_parse_kptr() cannot found the kptr type from a
> >> kernel's btf (the id == -ENOENT case in btf_parse_kptr).
>
> It seems your suggestion of using btf_is_kernel() is much simpler:
>
> (1) btf_parse_kptr():  doesn't invoke btf_get() when bpf_find_btf_id()
> returns -ENOENT and let the caller ensure the life cycle of the passed btf.
> (2) btf_record_free(): only invoke btf_put() if the saved btf is a
> kernel btf.
> (3) btf_record_dup(): only invoke btf_get() if the saved btf is a kernel
> btf.

I see. It is much simpler. I will send a new version using this approach.

Thank you,
Amery

> > I added some code to better explain how I think it could work.
> >
> > I am thinking about adding a struct btf* under struct btf_record, and
> > then adding an argument in btf_parse_fields:
> >
> > btf_parse_fields(const struct btf *btf,..., bool btf_is_owner)
> > {
> >         ...
> >         /* Before the for loop that goes through info_arr */
> >         rec->btf = btf_is_owner ? btf : NULL;
> >         ...
> > }
> >
> > The btf_is_owner indicates whether the btf_record returned by the
> > function will be part of the btf. So map_check_btf() will set it to
> > false while btf_parse() will set it to true. Maybe "owner" is what
> > makes it confusing? (btf_record for a map belongs to bpf_map but not
> > map->btf. On the other hand, btf_record for a prog btf is part of it.)
> >
> > Then, in btf_record_free(), we can do:
> >
> > case BPF_KPTR_XXX:
> >         ...
> >         if (rec->btf != rec->fields[i].kptr.btf)
> >                 btf_put(rec->fields[i].kptr.btf);
> >
> > In btf_parse_kptr(), we also do similar things. We just need to pass
> > btf_record into it.
>