On Fri, Dec 31, 2021 at 09:15:07AM +0530, Kumar Kartikeya Dwivedi wrote: > On Fri, Dec 31, 2021 at 07:53:07AM IST, Alexei Starovoitov wrote: > > On Thu, Dec 30, 2021 at 08:06:58AM +0530, Kumar Kartikeya Dwivedi wrote: > > > [...] > > > + > > > + /* Identify type */ > > > + symbol_name += pfx_size; > > > + if (!*symbol_name) { > > > + bpf_log(bdata->log, "incomplete kfunc btf_id_set specification: %s\n", orig_name); > > > + return -EINVAL; > > > + } > > > + for (i = 0; i < ARRAY_SIZE(kfunc_type_str); i++) { > > > + pfx_size = strlen(kfunc_type_str[i]); > > > + if (strncmp(symbol_name, kfunc_type_str[i], pfx_size)) > > > + continue; > > > + break; > > > + } > > > + if (i == ARRAY_SIZE(kfunc_type_str)) { > > > + bpf_log(bdata->log, "invalid type '%s' for kfunc_btf_id_set %s\n", symbol_name, > > > + orig_name); > > > + return -EINVAL; > > > + } > > > + type = i; > > > + > > > + return btf_populate_kfunc_sets(bdata->btf, bdata->log, hook, type, set); > > > > I really like the direction taken by patches 2 and 3. > > I think we can save the module_kallsyms_on_each_symbol loop though. > > The registration mechanism, like: > > register_kfunc_btf_id_set(&prog_test_kfunc_list, &bpf_testmod_kfunc_btf_set); > > doesn't have to be complete removed. > > It can replaced with a sequence of calls: > > btf_populate_kfunc_sets(btf, hook, type, set); > > from __init of the module. > > The module knows its 'hook' and 'type' and set==&bpf_testmod_kfunc_btf_set. > > The bpf_testmod_init() can call btf_populate_kfunc_sets() multiple > > times to popualte sets into different hooks and types. > > There is no need to 'unregister' any more. > > And the patch 1 will no longer be necessary, since we don't need to iterate > > every symbol of the module with module_kallsyms_on_each_symbol(). > > No need to standardize on the prefix and kfunc_[hook|type]_str, > > though it's probably good idea anyway across module BTF sets. > > The main disadvantage is that we lose 'log' in btf_populate_kfunc_sets(), > > since __init of the module cannot have verifier log at that point. > > But it can stay as 'ret = -E2BIG;' without bpf_log() and module registration > > will fail in such case or we just warn inside __init if btf_populate_kfunc_sets > > fails in the rare case. > > wdyt? > > > > Sounds good, I'll make this change in the next version. Should I also drop > kallsyms_on_each_symbol for vmlinux BTF? I think we can use initcall for it too, > right? Yep. Of course. For consistency. > > > +} > > > + > > > +static int btf_parse_kfunc_sets(struct btf *btf, struct module *mod, > > > + struct bpf_verifier_log *log) > > > +{ > > > + struct btf_parse_kfunc_data data = { .btf = btf, .log = log, }; > > > + struct btf_kfunc_set_tab *tab; > > > + int hook, type, ret; > > > + > > > + if (!btf_is_kernel(btf)) > > > + return -EINVAL; > > > + if (WARN_ON_ONCE(btf_is_module(btf) && !mod)) { > > > + bpf_log(log, "btf internal error: no module for module BTF %s\n", btf->name); > > > + return -EFAULT; > > > + } > > > + if (mod) > > > + ret = module_kallsyms_on_each_symbol(mod, btf_parse_kfunc_sets_cb, &data); > > > + else > > > + ret = kallsyms_on_each_symbol(btf_parse_kfunc_sets_cb, &data); > > > + > > > + tab = btf->kfunc_set_tab; > > > + if (!ret && tab) { > > > + /* Sort all populated sets */ > > > + for (hook = 0; hook < ARRAY_SIZE(tab->sets); hook++) { > > > + for (type = 0; type < ARRAY_SIZE(tab->sets[0]); type++) { > > > + struct btf_id_set *set = tab->sets[hook][type]; > > > + > > > + /* Not all sets may be populated */ > > > + if (!set) > > > + continue; > > > + sort(set->ids, set->cnt, sizeof(set->ids[0]), btf_kfunc_ids_cmp, > > > + NULL); > > > > Didn't resolve_btfid store ids already sorted? > > Why another sort is needed? > > Because it might be possible while iterating over symbols (be it vmlinux or > module), we combine sets like [1, 4, 6] and [2, 3, 5] into [1, 4, 6, 2, 3, 5], > into the set for a certain hook, type, so to enable bsearch we do one final sort > after possible sets have been populated. > > > Because btf_populate_kfunc_sets() can concatenate the sets? > > But if we let __init call it directly the module shouldn't use > > the same hook/type combination multiple times with different sets. > > So no secondary sorting will be necessary? > > > > Yes, if we make it that only one call per hook/type can be done, then this > shouldn't be needed, but e.g. if each file has a set for some hook and uses late > initcall to do registration, then it will be needed again for the same reason. You mean when module has multiple files and these sets have to be in different files? > We can surely catch the second call (see if tab->[hook][type] != NULL). Good idea. Let's do this for now. One .c per module with such sets is fine, imo. > > > This commit prepares the BTF parsing functions for vmlinux and module > > > BTFs to find all kfunc BTF ID sets from the vmlinux and module symbols > > > and concatentate all sets into single unified set which is sorted and > > > keyed by the 'hook' it is meant for, and 'type' of set. > > > > 'sorted by hook' ? > > The btf_id_set_contains() need to search it by 'id', so it's sorted by 'id'. > > Yeah, it needs a comma after 'sorted' :). > > > Is it because you're adding mod's IDs to vmlinux IDs and re-sorting them? > > No, I'm not mixing them. We only add same module's/vmlinux's IDs to its struct > btf, then sort each set (for hook,type pair). find_kfunc_desc_btf gives us the > BTF, then we can directly do what is essentially a single btf_id_set_contains > call, so it is not required to research in vmlinux BTF. The BTF associated with > the kfunc call is known. Got it. Ok.