On Mon, Aug 5, 2024 at 1:00 PM Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote: > > On 8/5/24 12:32 AM, Hou Tao wrote: > > Hi, > > > > On 8/5/2024 12:31 PM, Amery Hung wrote: > >> On Sun, Aug 4, 2024 at 7:41 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote: > >>> Hi, > >>> > >>> On 8/3/2024 8:11 AM, Amery Hung wrote: > >>>> From: Dave Marchevsky <davemarchevsky@xxxxxx> > >>>> > >>>> Currently btf_parse_fields is used in two places to create struct > >>>> btf_record's for structs: when looking at mapval type, and when looking > >>>> at any struct in program BTF. The former looks for kptr fields while the > >>>> latter does not. This patch modifies the btf_parse_fields call made when > >>>> looking at prog BTF struct types to search for kptrs as well. > >>>> > >>> SNIP > >>>> On a side note, when building program BTF, the refcount of program BTF > >>>> is now initialized before btf_parse_struct_metas() to prevent a > >>>> refcount_inc() on zero warning. This happens when BPF_KPTR is present > >>>> in program BTF: btf_parse_struct_metas() -> btf_parse_fields() > >>>> -> btf_parse_kptr() -> btf_get(). This should be okay as the program BTF > >>>> is not available yet outside btf_parse(). > >>> If btf_parse_kptr() pins the passed btf, there will be memory leak for > >>> the btf after closing the btf fd, because the invocation of btf_put() > >>> for kptr record in btf->struct_meta_tab depends on the invocation of > >>> btf_free_struct_meta_tab() in btf_free(), but the invocation of > >>> btf_free() depends the final refcnt of the btf is released, so the btf > >>> will not be freed forever. The reason why map value doesn't have such > >>> problem is that the invocation of btf_put() for kptr record doesn't > >>> depends on the release of map value btf and it is accomplished by > >>> bpf_map_free_record(). > >>> > >> Thanks for pointing this out. It makes sense to me. > >> > >>> Maybe we should move the common btf used by kptr and graph_root into > >>> btf_record and let the callers of btf_parse_fields() and > >>> btf_record_free() to decide the life cycle of btf in btf_record. > >> Could you maybe explain if and why moving btf of btf_field_kptr and > >> btf_field_graph_root to btf_record is necessary? I think letting > >> callers of btf_parse_fields() and btf_record_free() decide whether or > >> not to change refcount should be enough. Besides, I personally would > >> like to keep individual btf in btf_field_kptr and > >> btf_field_graph_root, so that later we can have special fields > >> referencing different btf. > > > > Sorry, I didn't express the rough idea clearly enough. I didn't mean to > > move btf of btf_field_kptr and btf_field_graph_root to btf_record, > > because there are other btf-s which are different with the btf which > > creates the struct_meta_tab. What I was trying to suggest is to save one > > btf in btf_record and hope it will simplify the pin and the unpin of btf > > in btf_record: > > > > 1) save the btf which owns the btf_record in btf_record. > > 2) during btf_parse_kptr() or similar, if the used btf is the same as > > the btf in btf_record, there is no need to pin the btf > > I assume the used btf is the one that btf_parse is working on. > > > 3) when freeing the btf_record, if the btf saved in btf_field is the > > same as the btf in btf_record, there is no need to put it > > For btf_field_kptr.btf, is it the same as testing the btf_field_kptr.btf is > btf_is_kernel() or not? How about only does btf_get/put for btf_is_kernel()? > IIUC. It will not be the same. For a map referencing prog btf, I suppose we should still do btf_get(). I think the core idea is since a btf_record and the prog btf containing it has the same life time, we don't need to btf_get()/btf_put() in btf_parse_kptr()/btf_record_free() when a btf_field_kptr.btf is referencing itself. However, since btf_parse_kptr() called from btf_parse() and map_check_btf() all use prog btf, we need a way to differentiate the two. Hence Hou suggested saving the owner's btf in btf_record, and then check if btf_record->btf is the same as the btf_field_kptr.btf in btf_parse_kptr()/btf_record_free(). > > > > > For step 2) and step 3), however I think it is also doable through other > > ways (e.g., pass the btf to btf_record_free or similar).