Hi, On 8/6/2024 8:31 AM, Amery Hung wrote: > On Mon, Aug 5, 2024 at 3:25 PM Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote: >> On 8/5/24 1:44 PM, Amery Hung wrote: >>>>>>> Maybe we should move the common btf used by kptr and graph_root into >>>>>>> btf_record and let the callers of btf_parse_fields() and >>>>>>> btf_record_free() to decide the life cycle of btf in btf_record. >>>>>> Could you maybe explain if and why moving btf of btf_field_kptr and >>>>>> btf_field_graph_root to btf_record is necessary? I think letting >>>>>> callers of btf_parse_fields() and btf_record_free() decide whether or >>>>>> not to change refcount should be enough. Besides, I personally would >>>>>> like to keep individual btf in btf_field_kptr and >>>>>> btf_field_graph_root, so that later we can have special fields >>>>>> referencing different btf. >>>>> Sorry, I didn't express the rough idea clearly enough. I didn't mean to >>>>> move btf of btf_field_kptr and btf_field_graph_root to btf_record, >>>>> because there are other btf-s which are different with the btf which >>>>> creates the struct_meta_tab. What I was trying to suggest is to save one >>>>> btf in btf_record and hope it will simplify the pin and the unpin of btf >>>>> in btf_record: >>>>> >>>>> 1) save the btf which owns the btf_record in btf_record. >>>>> 2) during btf_parse_kptr() or similar, if the used btf is the same as >>>>> the btf in btf_record, there is no need to pin the btf >>>> I assume the used btf is the one that btf_parse is working on. >>>> >>>>> 3) when freeing the btf_record, if the btf saved in btf_field is the >>>>> same as the btf in btf_record, there is no need to put it >>>> For btf_field_kptr.btf, is it the same as testing the btf_field_kptr.btf is >>>> btf_is_kernel() or not? How about only does btf_get/put for btf_is_kernel()? >>>> >>> IIUC. It will not be the same. For a map referencing prog btf, I >>> suppose we should still do btf_get(). >>> >>> I think the core idea is since a btf_record and the prog btf >>> containing it has the same life time, we don't need to >>> btf_get()/btf_put() in btf_parse_kptr()/btf_record_free() when a >>> btf_field_kptr.btf is referencing itself. >>> >>> However, since btf_parse_kptr() called from btf_parse() and >>> map_check_btf() all use prog btf, we need a way to differentiate the >>> two. Hence Hou suggested saving the owner's btf in btf_record, and >> map_check_btf() calls btf_parse_kptr(map->btf). >> >> I am missing how it is different from the >> btf_new_fd()=>btf_parse()=>btf_parse_kptr(new_btf). >> >> akaik, the map->record has no issue now because bpf_map_free_deferred() does >> btf_record_free(map->record) before btf_put(map->btf). In the map->record case, >> does the map->record need to take a refcnt of the btf_field_kptr.btf if the >> btf_field_kptr.btf is pointing back to itself (map->btf) which is not a kernel btf? I think you are right. For both callees of btf_parse_kptr() when the btf saved in btf_field_kptr.btf is the same as the passed btf, there is no need to call btf_get(). For bpf map case, just as you remained, map->btf has already pin the btf passed to btf_parse_kptr() in map_create(). >>> then check if btf_record->btf is the same as the btf_field_kptr.btf in >>> btf_parse_kptr()/btf_record_free(). >> I suspect it will have the same end result? The btf_field_kptr.btf is only the >> same as the owner's btf when btf_parse_kptr() cannot found the kptr type from a >> kernel's btf (the id == -ENOENT case in btf_parse_kptr). It seems your suggestion of using btf_is_kernel() is much simpler: (1) btf_parse_kptr(): doesn't invoke btf_get() when bpf_find_btf_id() returns -ENOENT and let the caller ensure the life cycle of the passed btf. (2) btf_record_free(): only invoke btf_put() if the saved btf is a kernel btf. (3) btf_record_dup(): only invoke btf_get() if the saved btf is a kernel btf. > I added some code to better explain how I think it could work. > > I am thinking about adding a struct btf* under struct btf_record, and > then adding an argument in btf_parse_fields: > > btf_parse_fields(const struct btf *btf,..., bool btf_is_owner) > { > ... > /* Before the for loop that goes through info_arr */ > rec->btf = btf_is_owner ? btf : NULL; > ... > } > > The btf_is_owner indicates whether the btf_record returned by the > function will be part of the btf. So map_check_btf() will set it to > false while btf_parse() will set it to true. Maybe "owner" is what > makes it confusing? (btf_record for a map belongs to bpf_map but not > map->btf. On the other hand, btf_record for a prog btf is part of it.) > > Then, in btf_record_free(), we can do: > > case BPF_KPTR_XXX: > ... > if (rec->btf != rec->fields[i].kptr.btf) > btf_put(rec->fields[i].kptr.btf); > > In btf_parse_kptr(), we also do similar things. We just need to pass > btf_record into it.