Re: WARN: multiple IDs found for 'nf_conn': 92168, 117897 - using 92168

On 14/10/2022 07:47, Jiri Olsa wrote:
> On Thu, Oct 13, 2022 at 03:24:59PM -0700, Andrii Nakryiko wrote:
>> On Thu, Oct 13, 2022 at 3:12 PM Jiri Olsa <olsajiri@xxxxxxxxx> wrote:
>>>
>>> On Thu, Oct 13, 2022 at 08:05:17AM -0700, Jakub Kicinski wrote:
>>>> On Wed, 5 Oct 2022 22:07:57 +0200 Jiri Olsa wrote:
>>>>>> Yeah, it's there on linux-next, too.
>>>>>>
>>>>>> Let me grab a fresh VM and try there. Maybe it's my system. Somehow.
>>>>>
>>>>> ok, I will look around what's the way to install that centos 8 thing
>>>>
>>>> Any luck?
>>>
>>> now BTFIDS warnings..
>>>
>>> I can see following on centos8 with gcc 8.5:
>>>
>>>           BTFIDS  vmlinux
>>>         WARN: multiple IDs found for 'task_struct': 300, 56614 - using 300
>>>         WARN: multiple IDs found for 'file': 540, 56649 - using 540
>>>         WARN: multiple IDs found for 'vm_area_struct': 549, 56652 - using 549
>>>         WARN: multiple IDs found for 'seq_file': 953, 56690 - using 953
>>>         WARN: multiple IDs found for 'inode': 1132, 56966 - using 1132
>>>         WARN: multiple IDs found for 'path': 1164, 56995 - using 1164
>>>         WARN: multiple IDs found for 'task_struct': 300, 61905 - using 300
>>>         WARN: multiple IDs found for 'file': 540, 61943 - using 540
>>>         WARN: multiple IDs found for 'vm_area_struct': 549, 61946 - using 549
>>>         WARN: multiple IDs found for 'inode': 1132, 62029 - using 1132
>>>         WARN: multiple IDs found for 'path': 1164, 62058 - using 1164
>>>         WARN: multiple IDs found for 'cgroup': 1190, 62067 - using 1190
>>>         WARN: multiple IDs found for 'seq_file': 953, 62253 - using 953
>>>         WARN: multiple IDs found for 'sock': 7960, 62374 - using 7960
>>>         WARN: multiple IDs found for 'sk_buff': 1876, 62485 - using 1876
>>>         WARN: multiple IDs found for 'bpf_prog': 6094, 62542 - using 6094
>>>         WARN: multiple IDs found for 'socket': 7993, 62545 - using 7993
>>>         WARN: multiple IDs found for 'xdp_buff': 6191, 62836 - using 6191
>>>         WARN: multiple IDs found for 'sock_common': 8164, 63152 - using 8164
>>>         WARN: multiple IDs found for 'request_sock': 17296, 63204 - using 17296
>>>         WARN: multiple IDs found for 'inet_request_sock': 36292, 63222 - using 36292
>>>         WARN: multiple IDs found for 'inet_sock': 32700, 63225 - using 32700
>>>         WARN: multiple IDs found for 'inet_connection_sock': 33944, 63240 - using 33944
>>>         WARN: multiple IDs found for 'tcp_request_sock': 36299, 63260 - using 36299
>>>         WARN: multiple IDs found for 'tcp_sock': 33969, 63264 - using 33969
>>>         WARN: multiple IDs found for 'bpf_map': 6623, 63343 - using 6623
>>>
>>> I'll need to check on that..
>>>
>>> and I just actually saw the 'nf_conn' warning on linux-next/master with
>>> latest fedora/gcc-12:
>>>
>>>           BTF [M] net/netfilter/nf_nat.ko
>>>         WARN: multiple IDs found for 'nf_conn': 106518, 120156 - using 106518
>>>         WARN: multiple IDs found for 'nf_conn': 106518, 121853 - using 106518
>>>         WARN: multiple IDs found for 'nf_conn': 106518, 123126 - using 106518
>>>         WARN: multiple IDs found for 'nf_conn': 106518, 124537 - using 106518
>>>         WARN: multiple IDs found for 'nf_conn': 106518, 126442 - using 106518
>>>         WARN: multiple IDs found for 'nf_conn': 106518, 128256 - using 106518
>>>           LD [M]  net/netfilter/nf_nat_tftp.ko
>>>
>>> looks like maybe dedup missed this struct for some reason
>>>
>>> nf_conn dump from module:
>>>
>>>         [120155] PTR '(anon)' type_id=120156
>>>         [120156] STRUCT 'nf_conn' size=320 vlen=14
>>>                 'ct_general' type_id=105882 bits_offset=0
>>>                 'lock' type_id=180 bits_offset=64
>>>                 'timeout' type_id=113 bits_offset=640
>>>                 'zone' type_id=106520 bits_offset=672
>>>                 'tuplehash' type_id=106533 bits_offset=704
>>>                 'status' type_id=1 bits_offset=1600
>>>                 'ct_net' type_id=3215 bits_offset=1664
>>>                 'nat_bysource' type_id=139 bits_offset=1728
>>>                 '__nfct_init_offset' type_id=949 bits_offset=1856
>>>                 'master' type_id=120155 bits_offset=1856
>>>                 'mark' type_id=106351 bits_offset=1920
>>>                 'secmark' type_id=106351 bits_offset=1952
>>>                 'ext' type_id=106536 bits_offset=1984
>>>                 'proto' type_id=106532 bits_offset=2048
>>>
>>> nf_conn dump from vmlinux:
>>>
>>>         [106517] PTR '(anon)' type_id=106518
>>>         [106518] STRUCT 'nf_conn' size=320 vlen=14
>>>                 'ct_general' type_id=105882 bits_offset=0
>>>                 'lock' type_id=180 bits_offset=64
>>>                 'timeout' type_id=113 bits_offset=640
>>>                 'zone' type_id=106520 bits_offset=672
>>>                 'tuplehash' type_id=106533 bits_offset=704
>>>                 'status' type_id=1 bits_offset=1600
>>>                 'ct_net' type_id=3215 bits_offset=1664
>>>                 'nat_bysource' type_id=139 bits_offset=1728
>>>                 '__nfct_init_offset' type_id=949 bits_offset=1856
>>>                 'master' type_id=106517 bits_offset=1856
>>>                 'mark' type_id=106351 bits_offset=1920
>>>                 'secmark' type_id=106351 bits_offset=1952
>>>                 'ext' type_id=106536 bits_offset=1984
>>>                 'proto' type_id=106532 bits_offset=2048
>>>
>>> look identical.. Andrii, any idea?
>>
>> I'm pretty sure they are not identical. There is somewhere a STRUCT vs
>> FWD difference. We had a similar discussion recently with Alan
>> Maguire.
>>
>>>                 'master' type_id=120155 bits_offset=1856
>>
>> vs
>>
>>>                 'master' type_id=106517 bits_offset=1856
> 
> master is pointer to same 'nf_conn' object, and rest of the ids are same
> 
> jirka
> 

I tried digging into this problem a bit - in my case I was seeing 
"struct sk_buff" duplicated in kernel/module BTF. Here's what I found..

Consider a situation where a header file defines a struct s1 that has a
pointer field pointing at struct s2, but struct s2 is only forward-declared:

$ cat s1.h
#include <stdio.h>
struct s2;

struct s1 {
        struct s1 *f1;
        struct s2 *f2;
};

$ cat s1.c
#include "s1.h"

int main(int argc, char *argv[])
{
        struct s1 s1;

        return 0;
}

Now consider a separate program s2 that #includes the definitions of both
s1 and s2:

$ cat s2.h
#include <stdio.h>

struct s1;

struct s2 {
        struct s1 *f1;
};

$ cat s2.c

#include "s2.h"
#include "s1.h"

int main(int argc, char *argv[])
{
	struct s1 s1 = {};
	struct s2 s2 = {};

	return 0;

}

In this case the generated base BTF contains a definition for s1,
and a FWD for s2, but the "module" BTF for s2 contains a full
definition for s2, so the dedup fails:
 
$ bpftool btf dump file s1
[29] STRUCT 's1' size=16 vlen=2
	'f1' type_id=30 bits_offset=0
	'f2' type_id=32 bits_offset=64
[30] PTR '(anon)' type_id=29
[31] FWD 's2' fwd_kind=struct

$ bpftool btf dump -B s1 file s2
[36] STRUCT 's2' size=8 vlen=1
	'f1' type_id=38 bits_offset=0
[37] STRUCT 's1' size=16 vlen=2
	'f1' type_id=38 bits_offset=0
	'f2' type_id=39 bits_offset=64
[38] PTR '(anon)' type_id=37
[39] PTR '(anon)' type_id=36


So we had to redefine struct s1 in the "module" because the
FWD for s2 wasn't resolved in the base BTF. This is by design as I
understand it: in effect we can't supplement base BTF with
information about forward resolution that we only learn from
module BTF.
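
To make the mismatch concrete, here is a toy model of the comparison.
It is a sketch under my own assumptions, not libbpf's dedup code: the
struct type layout, the hypot[] array, equiv() and s1_matches_base()
are all made-up names, base and "module" types live in separate arrays
rather than one split-BTF ID space, and a FWD is simply treated as
different from a STRUCT, which only models the base-vs-module case
described above.

```c
#include <assert.h>
#include <string.h>

/* Toy stand-in for a BTF type record; NOT libbpf's real layout. */
enum kind { K_STRUCT, K_PTR, K_FWD };

struct type {
	enum kind kind;
	const char *name;	/* "" for anonymous pointers */
	int nr_refs;		/* member types (STRUCT) or pointee (PTR) */
	int refs[2];
};

#define MAX_TYPES 8

/* hypot[m] = base ID that module ID m is assumed to match, or -1 */
static int hypot[MAX_TYPES];

/* Strict structural comparison of base type b against module type m. */
static int equiv(const struct type *base, int b,
		 const struct type *mod, int m)
{
	int i;

	assert(m >= 0 && m < MAX_TYPES);
	if (hypot[m] >= 0)			/* seen before: handles cycles */
		return hypot[m] == b;
	if (base[b].kind != mod[m].kind)	/* FWD vs STRUCT stops here */
		return 0;
	if (strcmp(base[b].name, mod[m].name) != 0)
		return 0;
	if (base[b].nr_refs != mod[m].nr_refs)
		return 0;
	hypot[m] = b;
	for (i = 0; i < base[b].nr_refs; i++)
		if (!equiv(base, base[b].refs[i], mod, mod[m].refs[i]))
			return 0;
	return 1;
}

/* Returns 1 if the module's copy of s1 matches the base's s1, else 0. */
int s1_matches_base(int module_defines_s2)
{
	/* base: what s1.c sees, where s2 is only a FWD */
	const struct type base[] = {
		{ K_STRUCT, "s1", 2, { 1, 2 } },
		{ K_PTR,    "",   1, { 0 } },
		{ K_PTR,    "",   1, { 3 } },
		{ K_FWD,    "s2", 0, { 0 } },
	};
	/* module: same s1, but s2 may be a full STRUCT with one member */
	const struct type mod[] = {
		{ K_STRUCT, "s1", 2, { 1, 2 } },
		{ K_PTR,    "",   1, { 0 } },
		{ K_PTR,    "",   1, { 3 } },
		{ module_defines_s2 ? K_STRUCT : K_FWD, "s2",
		  module_defines_s2 ? 1 : 0, { 1 } },
	};
	int i;

	for (i = 0; i < MAX_TYPES; i++)
		hypot[i] = -1;
	return equiv(base, 0, mod, 0);
}
```

With module_defines_s2 set, the walk from s1 through f2 ends at a
STRUCT on the module side and a FWD on the base side, so s1 as a whole
fails to match even though its own members are identical. With it
clear, both sides end at a FWD and s1 matches.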

Now does this sort of thing happen in the kernel? It looks like 
it; consider struct nf_conn; it contains a possible_net_t:

typedef struct {
	struct net *               net;                  /*     0     8 */

	/* size: 8, cachelines: 1, members: 1 */
	/* last cacheline: 8 bytes */
} possible_net_t;

...and struct net contains pointers to structures
that aren't fully defined in the vmlinux BTF (because
they are defined in modules); for example:

	struct netns_ipvs *        ipvs;                 /*  3912     8 */

$ pahole netns_ipvs
pahole: type 'netns_ipvs' not found

...and in vmlinux BTF it is:

[2983] FWD 'netns_ipvs' fwd_kind=struct
[2984] PTR '(anon)' type_id=2983

...and in struct net we can see the fwd type is referenced alright:

[2021] STRUCT 'net' size=4288 vlen=52
...
        'ipvs' type_id=2984 bits_offset=31808

So we'd expect any ipvs-related modules to not dedup
struct net, since they'll have the full definition
for netns_ipvs. In xt_ipvs.ko we see:

[111924] STRUCT 'netns_ipvs' size=2176 vlen=78
        'gen' type_id=21 bits_offset=0
        'enable' type_id=21 bits_offset=32
        'rs_table' type_id=4044 bits_offset=64
        'app_list' type_id=83 bits_offset=1088

...and when we look at 'struct net' we see:

[111786] STRUCT 'net' size=4288 vlen=52
...
     'ipvs' type_id=111925 bits_offset=31808

And then if we don't dedup struct net, it seems likely
that structures referencing struct net (like skbs,
nf_conn etc) won't dedup either since they'll point
at "their" version of struct net.
	
Not sure if that's the root cause here, but it seems
like it is happening in other modules at least.
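
The knock-on effect can be sketched as a fixed-point propagation over
a reference graph. This is only an illustration under assumptions: the
edges in refs[][] are simplified by hand (for instance sk_buff is shown
referring to net directly) and are not taken from real BTF, and
mark_differing() is a made-up helper, not anything in libbpf or pahole.

```c
#define NR_TYPES 6

enum { NETNS_IPVS, NET, POSSIBLE_NET_T, NF_CONN, SK_BUFF, LIST_HEAD };

/* refs[t][u] != 0: type t refers, directly or via a pointer, to type u.
 * Hand-written and simplified for illustration only. */
static const int refs[NR_TYPES][NR_TYPES] = {
	[NET][NETNS_IPVS]	= 1,
	[NET][LIST_HEAD]	= 1,
	[POSSIBLE_NET_T][NET]	= 1,
	[NF_CONN][POSSIBLE_NET_T] = 1,
	[SK_BUFF][NET]		= 1,
};

/*
 * differs[t] is nonzero on entry for types whose module copy cannot be
 * mapped onto a base type (e.g. base only has a FWD). Mark every type
 * that transitively refers to such a type, and return how many types
 * are marked in total.
 */
int mark_differing(int differs[NR_TYPES])
{
	int changed = 1, count = 0, t, u;

	while (changed) {
		changed = 0;
		for (t = 0; t < NR_TYPES; t++) {
			if (differs[t])
				continue;
			for (u = 0; u < NR_TYPES; u++) {
				if (refs[t][u] && differs[u]) {
					differs[t] = 1;
					changed = 1;
					break;
				}
			}
		}
	}
	for (t = 0; t < NR_TYPES; t++)
		count += differs[t] ? 1 : 0;
	return count;
}
```

Seeding only netns_ipvs as differing ends up marking net,
possible_net_t, nf_conn and sk_buff as well, while list_head, which
refers to none of them, still maps onto the base type.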

More subtle effects are also possible I think; if a type
is defined in a header file but not referenced anywhere
(as might well happen for a module-related type in vmlinux),
it might not always make it into the DWARF description,
and as a result might not have a BTF representation.

Alan


