On Mon, May 24, 2021 at 3:58 PM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
>
> On Thu, May 20, 2021 at 10:31 PM Andrii Nakryiko
> <andrii.nakryiko@xxxxxxxxx> wrote:
> >
> > On Wed, May 19, 2021 at 7:19 AM Michal Suchánek <msuchanek@xxxxxxx> wrote:
> > >
> > > Hello,
> > >
> > > linux-next fails to boot for me:
> > >
> > > [ 0.000000] Linux version 5.13.0-rc2-next-20210519-1.g3455ff8-vanilla (geeko@buildhost) (gcc (SUSE Linux) 10.3.0, GNU ld (GNU Binutils;
> > > openSUSE Tumbleweed) 2.36.1.20210326-3) #1 SMP Wed May 19 10:05:10 UTC 2021 (3455ff8)
> > > [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.13.0-rc2-next-20210519-1.g3455ff8-vanilla root=UUID=ec42c33e-a2c2-4c61-afcc-93e9527
> > > 8f687 plymouth.enable=0 resume=/dev/disk/by-uuid/f1fe4560-a801-4faf-a638-834c407027c7 mitigations=auto earlyprintk initcall_debug nomodeset
> > > earlycon ignore_loglevel console=ttyS0,115200
> > > ...
> > > [ 26.093364] calling tracing_set_default_clock+0x0/0x62 @ 1
> > > [ 26.098937] initcall tracing_set_default_clock+0x0/0x62 returned 0 after 0 usecs
> > > [ 26.106330] calling acpi_gpio_handle_deferred_request_irqs+0x0/0x7c @ 1
> > > [ 26.113033] initcall acpi_gpio_handle_deferred_request_irqs+0x0/0x7c returned 0 after 3 usecs
> > > [ 26.121559] calling clk_disable_unused+0x0/0x102 @ 1
> > > [ 26.126620] initcall clk_disable_unused+0x0/0x102 returned 0 after 0 usecs
> > > [ 26.133491] calling regulator_init_complete+0x0/0x25 @ 1
> > > [ 26.138890] initcall regulator_init_complete+0x0/0x25 returned 0 after 0 usecs
> > > [ 26.147816] Freeing unused decrypted memory: 2036K
> > > [ 26.153682] Freeing unused kernel image (initmem) memory: 2308K
> > > [ 26.165776] Write protecting the kernel read-only data: 26624k
> > > [ 26.173067] Freeing unused kernel image (text/rodata gap) memory: 2036K
> > > [ 26.180416] Freeing unused kernel image (rodata/data gap) memory: 1184K
> > > [ 26.187031] Run /init as init process
> > > [ 26.190693] with arguments:
> > > [ 26.193661] /init
> > > [ 26.195933] with environment:
> > > [ 26.199079] HOME=/
> > > [ 26.201444] TERM=linux
> > > [ 26.204152] BOOT_IMAGE=/boot/vmlinuz-5.13.0-rc2-next-20210519-1.g3455ff8-vanilla
> > > [ 26.254154] BPF: type_id=35503 offset=178440 size=4
> > > [ 26.259125] BPF:
> > > [ 26.261054] BPF:Invalid offset
> > > [ 26.264119] BPF:
> >
> > It took me a while to reliably bisect this, but it clearly points to
> > this commit:
> >
> > e481fac7d80b ("mm/page_alloc: convert per-cpu list protection to local_lock")
> >
> > One commit before it, 676535512684 ("mm/page_alloc: split per cpu page
> > lists and zone stats -fix"), works just fine.
> >
> > I'll have to spend more time debugging what exactly is happening, but
> > the immediate problem is two different definitions of numa_node
> > per-cpu variable. They both are at the same offset within
> > .data..percpu ELF section, they both have the same name, but one of
> > them is marked as static and another as global. And one is int
> > variable, while another is struct pagesets. I'll look some more
> > tomorrow, but adding Jiri and Arnaldo for visibility.
> >
> > [110907] DATASEC '.data..percpu' size=178904 vlen=303
> > ...
> >         type_id=27753 offset=163976 size=4 (VAR 'numa_node')
> >         type_id=27754 offset=163976 size=4 (VAR 'numa_node')
> >
> > [27753] VAR 'numa_node' type_id=27556, linkage=static
> > [27754] VAR 'numa_node' type_id=20, linkage=global
> >
> > [20] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
> >
> > [27556] STRUCT 'pagesets' size=0 vlen=1
> >         'lock' type_id=507 bits_offset=0
> >
> > [506] STRUCT '(anon)' size=0 vlen=0
> > [507] TYPEDEF 'local_lock_t' type_id=506
> >
> > So also something weird about those zero-sized struct pagesets and
> > local_lock_t inside it.
>
> Ok, so nothing weird about them. local_lock_t is designed to be
> zero-sized unless CONFIG_DEBUG_LOCK_ALLOC is defined.
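
For illustration only, here is a minimal sketch of how a struct can end up with size 0, matching the `STRUCT 'pagesets' size=0` entry quoted above. It assumes GNU C (the GCC/Clang extension that gives a struct with no members a size of 0). All `demo_*` names are hypothetical stand-ins and not the kernel's real definitions.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for local_lock_t when CONFIG_DEBUG_LOCK_ALLOC is
 * not defined: a struct with no members. Assumption: GNU C, where such a
 * struct has size 0 (ISO C does not allow empty structs at all).
 */
typedef struct { } demo_lock_t;

/* Mirrors the shape of struct pagesets: only the empty lock inside. */
struct demo_pagesets {
	demo_lock_t lock;
};

/* Same struct with one extra byte, so it can never be zero-sized. */
struct demo_pagesets_padded {
	demo_lock_t lock;
	char filler;
};

size_t demo_pagesets_size(void)
{
	return sizeof(struct demo_pagesets);
}

size_t demo_pagesets_padded_size(void)
{
	return sizeof(struct demo_pagesets_padded);
}
```

Under that GNU C assumption, a zero-sized variable takes no room in its section, so the next variable placed there can end up at the same offset.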
>
> But such zero-sized per-CPU variables are confusing pahole during BTF
> generation, as now two different variables "occupy" the same address.

FWIW, here's the pahole fix [0]. pahole already tried to filter out
zero-sized per-CPU vars, but not quite completely.

  [0] https://lore.kernel.org/bpf/20210524234222.278676-1-andrii@xxxxxxxxxx/T/#u

>
> Given this seems to be the first zero-sized per-CPU variable, I wonder
> if it would be ok to make sure it's never zero-sized, while pahole
> gets fixed and its latest version gets widely packaged and
> distributed.
>
> Mel, what do you think about something like below? Or maybe you can
> advise some better solution?
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 41b87d6f840c..6a1d7511cae9 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -124,6 +124,13 @@ static DEFINE_MUTEX(pcp_batch_high_lock);
>
>  struct pagesets {
>  	local_lock_t lock;
> +#if defined(CONFIG_DEBUG_INFO_BTF) && !defined(CONFIG_DEBUG_LOCK_ALLOC)
> +	/* pahole 1.21 and earlier gets confused by zero-sized per-CPU
> +	 * variables and produces invalid BTF. So to accommodate earlier
> +	 * versions of pahole, ensure that sizeof(struct pagesets) is never 0.
> +	 */
> +	char __filler;
> +#endif
>  };
>  static DEFINE_PER_CPU(struct pagesets, pagesets) = {
>  	.lock = INIT_LOCAL_LOCK(lock),
>
> >
> > > [ 26.264119]
> > > [ 26.267437] failed to validate module [efivarfs] BTF: -22
> > > [ 26.316724] systemd[1]: systemd 246.13+suse.105.g14581e0120 running in system mode. (+PAM +AUDIT +SELINUX -IMA +APPARMOR -SMACK +SYSVINI
> > > T +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +ZSTD +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=unified)
> > > [ 26.357990] systemd[1]: Detected architecture x86-64.
> > > [ 26.363068] systemd[1]: Running in initial RAM disk.
> > >
> > >
>
[...]
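
As a rough illustration of why two DATASEC entries at the same offset are a problem, the sketch below models a validator that rejects any entry starting before the previous one ends. This is an assumed, simplified model and not the kernel's actual BTF verification code. The struct and function names are hypothetical; the sample values in the usage note are taken from the `.data..percpu` dump quoted earlier in the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one DATASEC entry: type, offset, size. */
struct demo_var_secinfo {
	uint32_t type_id;
	uint32_t offset;
	uint32_t size;
};

/*
 * Return the index of the first entry that starts before the previous
 * entry ends (unordered or overlapping), or -1 if no such entry exists.
 * Assumption: entries are expected in ascending offset order.
 */
int demo_first_bad_offset(const struct demo_var_secinfo *v, size_t n)
{
	uint32_t last_end = 0;

	for (size_t i = 0; i < n; i++) {
		if (v[i].offset < last_end)
			return (int)i;
		last_end = v[i].offset + v[i].size;
	}
	return -1;
}
```

Fed the two 'numa_node' entries from the dump (both at offset=163976 with size=4), this model flags the second one, since it starts inside the range claimed by the first.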