On Sun, Nov 13, 2022 at 07:04:43PM +0100, Toke Høiland-Jørgensen wrote: > Lorenzo Bianconi <lorenzo.bianconi@xxxxxxxxxx> writes: > > >> > >> Hi everyone > >> > >> There seems to be some issue with BTF mismatch when trying to run the > >> bpf_ct_set_nat_info() kfunc from a module. I was under the impression > >> that this is supposed to work, so is there some kind of BTF dedup issue > >> here or something? > >> > >> Steps to reproduce: > >> > >> 1. Compile kernel with nf_conntrack built-in and run selftests; > >> './test_progs -a bpf_nf' works > >> > >> 2. Change the kernel config so nf_conntrack is build as a module > >> > >> 3. Start the test kernel and manually modprobe nf_conntrack and nf_nat > >> > >> 4. Run ./test_progs -a bpf_nf; this now fails with an error like: > >> > >> kernel function bpf_ct_set_nat_info args#0 expected pointer to STRUCT nf_conn___init but R1 has a pointer to STRUCT nf_conn___init > > > > This week Kumar and I took a look at this issue and we ended up > > identifying a duplication of nf_conn___init structure. In particular: > > > > [~/workspace/bpf-next]$ bpftool btf --base-btf vmlinux dump file > > net/netfilter/nf_conntrack.ko format raw | grep nf_conn__ > > [110941] STRUCT 'nf_conn___init' size=248 vlen=1 > > [~/workspace/bpf-next]$ bpftool btf --base-btf vmlinux dump file > > net/netfilter/nf_nat.ko format raw | grep nf_conn__ > > [107488] STRUCT 'nf_conn___init' size=248 vlen=1 > > > > Is it the root cause of the problem? > > It certainly seems to be related to it, at least. Amending the log > message to include the BTF object IDs of the two versions shows that the > register has a reference to nf_conn__init in nf_conntrack.ko, while the kernel > expects it to point to nf_nat.ko. > > Not sure what's the right fix for this? Should libbpf be smart enough to > pull the kfunc arg ID from the same BTF ID as the function itself? Or Verifier chose the ID based on where the variable was allocated, which is bpf_(skb|xdp)_ct_alloc() from nf_conntrack.ko. Assuming btf id based on usage location might be error prone. > should the kernel compare structs and allow things if they're identical? > Andrii, WDYT? This works but might make verifier slower. diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 195d24316750..562d2c15906d 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -24,6 +24,7 @@ #include <linux/bpf_lsm.h> #include <linux/btf_ids.h> #include <linux/poison.h> +#include "../tools/lib/bpf/relo_core.h" #include "disasm.h" @@ -8236,7 +8237,8 @@ static int process_kf_arg_ptr_to_btf_id(struct bpf_verifier_env *env, reg_ref_t = btf_type_skip_modifiers(reg_btf, reg_ref_id, ®_ref_id); reg_ref_tname = btf_name_by_offset(reg_btf, reg_ref_t->name_off); - if (!btf_struct_ids_match(&env->log, reg_btf, reg_ref_id, reg->off, meta->btf, ref_id, strict_type_match)) { + if (!btf_struct_ids_match(&env->log, reg_btf, reg_ref_id, reg->off, meta->btf, ref_id, strict_type_match) && \ + !__bpf_core_types_match(reg_btf, reg_ref_id, meta->btf, ref_id, false, 10)) { verbose(env, "kernel function %s args#%d expected pointer to %s %s but R%d has a pointer to %s %s\n", meta->func_name, argno, btf_type_str(ref_t), ref_tname, argno + 1, btf_type_str(reg_ref_t), reg_ref_tname); -- Regards, Artem