On 12/8/23 8:45 AM, Yonghong Song wrote:
On 12/8/23 12:16 AM, Martin KaFai Lau wrote:
On 12/7/23 7:59 PM, Yonghong Song wrote:
I am trying to avoid making a special case for "bool has_btf_ref;" and "bool
from_map_check". It seems to a bit too much to deal with the error path for
btf_parse().
Would doing the refcount_set(&btf->refcnt, 1) earlier in btf_parse help?
No, it does not. The core reason is what Hao is mentioned in
https://lore.kernel.org/bpf/47ee3265-23f7-2130-ff28-27bfaf3f7877@xxxxxxxxxxxxxxx/
We simply cannot take btf reference if called from btf_parse().
Let us say we move refcount_set(&btf->refcnt, 1) earlier in btf_parse()
so we take ref for btf during btf_parse_fields(), then we have
btf_put <=== expect refcount == 0 to start the destruction process
...
btf_record_free <=== in which if graph_root, a btf reference will
be hold
so btf_put will never be able to actually free btf data.
ah. There is a loop like btf->struct_meta_tab->...btf.
Yes, the kasan problem will be resolved but we leak memory.
It is also unnecessary to take a reference since the value_rec is
referring to a record in struct_meta_tab.
If we optimize for not taking a refcnt, how about not taking a refcnt for
all cases and postpone the btf_put(), instead of taking refcnt in one case
but not another. Like your fix in v1. The failed selftest can be changed or
even removed if it does not make sense anymore.
After a couple of iterations, I think taking necessary reference approach
sounds better
and this will be consistent with how kptr is handled. For kptr, btf_parse
will ignore it.
Got it. It is why kptr.btf got away with the loop.
On the other hand, am I reading it correctly that kptr.btf only needs to take
the refcnt for btf that is btf_is_kernel()?
No. besides vmlinux and module btf, it also takes reference for prog btf, see
static int btf_parse_kptr(const struct btf *btf, struct btf_field *field,
struct btf_field_info *info)
{
...
if (id == -ENOENT) {
/* btf_parse_kptr should only be called w/ btf = program BTF */
WARN_ON_ONCE(btf_is_kernel(btf));
/* Type exists only in program BTF. Assume that it's a MEM_ALLOC
* kptr allocated via bpf_obj_new
*/
field->kptr.dtor = NULL;
id = info->kptr.type_id;
kptr_btf = (struct btf *)btf;
btf_get(kptr_btf);
I meant only kernel/module btf needs to take the refcnt, so there is no need to
take the refcnt here for the (it)self btf. Sorry that I was not clear in my
earlier comment.
The record is capturing something either in the self btf or something in the
kernel btf. The field->kptr.kptr is the one that may either point to a kernel or
self btf, so it should be the only case that needs to check the following in
btf_record_free():
if (btf_is_kernel(rec->fields[i].kptr.btf))
btf_put(rec->fields[i].kptr.btf);
All other cases the record has a self btf (including field->graph_root.btf). The
owner (map here) needs to ensure the self btf is freed after the record is freed.
I was thinking if it can avoid doing different things based on where
btf_parse_fields() is called by separating what type of btf always needs refcnt
or not. Agree the approach in this patch will fix the issue also and I have
acked v5. Thanks for the fix.
goto found_dtor;
}
...
}
Unfortunately, for graph_root (list_head, rb_root), btf_parse and map_check
will both
process it and that adds a little bit complexity.
Alexei also suggested the same taking reference approach:
https://lore.kernel.org/bpf/CAADnVQL+uc6VV65_Ezgzw3WH=ME9z1Fdy8Pd6xd0oOq8rgwh7g@xxxxxxxxxxxxxx/