Lorenzo Bianconi <lorenzo.bianconi@xxxxxxxxxx> writes:

>> Hi everyone
>>
>> There seems to be some issue with BTF mismatch when trying to run the
>> bpf_ct_set_nat_info() kfunc from a module. I was under the impression
>> that this is supposed to work, so is there some kind of BTF dedup issue
>> here or something?
>>
>> Steps to reproduce:
>>
>> 1. Compile kernel with nf_conntrack built-in and run selftests;
>>    './test_progs -a bpf_nf' works
>>
>> 2. Change the kernel config so nf_conntrack is built as a module
>>
>> 3. Start the test kernel and manually modprobe nf_conntrack and nf_nat
>>
>> 4. Run ./test_progs -a bpf_nf; this now fails with an error like:
>>
>> kernel function bpf_ct_set_nat_info args#0 expected pointer to STRUCT nf_conn___init but R1 has a pointer to STRUCT nf_conn___init
>
> This week Kumar and I took a look at this issue and we ended up
> identifying a duplication of the nf_conn___init structure. In particular:
>
> [~/workspace/bpf-next]$ bpftool btf --base-btf vmlinux dump file net/netfilter/nf_conntrack.ko format raw | grep nf_conn__
> [110941] STRUCT 'nf_conn___init' size=248 vlen=1
> [~/workspace/bpf-next]$ bpftool btf --base-btf vmlinux dump file net/netfilter/nf_nat.ko format raw | grep nf_conn__
> [107488] STRUCT 'nf_conn___init' size=248 vlen=1
>
> Is it the root cause of the problem?

It certainly seems to be related to it, at least. Amending the log
message to include the BTF object IDs of the two versions shows that the
register holds a reference to nf_conn___init in nf_conntrack.ko, while
the kernel expects it to point to the copy in nf_nat.ko.

I'm not sure what the right fix for this is. Should libbpf be smart
enough to pull the kfunc arg ID from the same BTF ID as the function
itself? Or should the kernel compare structs and allow things if they're
identical?

Andrii, WDYT?

-Toke
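
To make the two options above concrete, here is a toy model of the
check, not the verifier's actual code. The type IDs, size and vlen come
from the bpftool dumps quoted above; the BtfType class, the use of the
module name as the BTF object identifier, and the helper names are all
assumptions made up for illustration. The "shape" comparison is also
deliberately shallow: a real check would have to walk the struct members.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BtfType:
    # Hypothetical model of a BTF type; the module name stands in for
    # the numeric BTF object ID the kernel would really use.
    btf_obj: str
    type_id: int
    kind: str
    name: str
    size: int
    vlen: int


# Values taken from the two bpftool dumps quoted in the thread.
CT_COPY = BtfType("nf_conntrack", 110941, "STRUCT", "nf_conn___init", 248, 1)
NAT_COPY = BtfType("nf_nat", 107488, "STRUCT", "nf_conn___init", 248, 1)


def same_identity(a: BtfType, b: BtfType) -> bool:
    # Strict match: same BTF object and same type ID within it.
    return (a.btf_obj, a.type_id) == (b.btf_obj, b.type_id)


def same_shape(a: BtfType, b: BtfType) -> bool:
    # Relaxed match: ignore which BTF object owns the type and compare
    # only kind, name, size and member count (shallow, members not walked).
    return (a.kind, a.name, a.size, a.vlen) == (b.kind, b.name, b.size, b.vlen)


def check_kfunc_arg(expected: BtfType, actual: BtfType, match) -> str:
    # Returns "ok" or a message shaped like the one in the bug report.
    if match(expected, actual):
        return "ok"
    return (f"expected pointer to {expected.kind} {expected.name} "
            f"but R1 has a pointer to {actual.kind} {actual.name}")


if __name__ == "__main__":
    # The kfunc expects the nf_nat.ko copy; the register holds the
    # nf_conntrack.ko copy.
    print(check_kfunc_arg(NAT_COPY, CT_COPY, same_identity))
    print(check_kfunc_arg(NAT_COPY, CT_COPY, same_shape))
```

With the strict match the two copies differ even though they print
identically, which is why the error message names the same struct twice.
With the relaxed match they compare equal.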