On 06/11/2024 16:08, Laura Nao wrote:
> Hello,
>
> KernelCI has detected a module loading regression affecting all AMD and
> Intel Chromebooks in the Collabora LAVA lab, occurring between
> next-20241024 and next-20241025.
>
> The logs indicate a failure in BTF module validation, preventing all
> modules from loading correctly (with CONFIG_MODULE_ALLOW_BTF_MISMATCH
> unset). The example below is from an AMD Chromebook (HP 14b na0052xx),
> with similar errors observed on other AMD and Intel devices:
>
> [ 5.284373] failed to validate module [cros_kbd_led_backlight] BTF: -22
> [ 5.291392] failed to validate module [i2c_hid] BTF: -22
> [ 5.293958] failed to validate module [chromeos_pstore] BTF: -22
> [ 5.302832] failed to validate module [coreboot_table] BTF: -22
> [ 5.309175] failed to validate module [raydium_i2c_ts] BTF: -22
> [ 5.309264] failed to validate module [i2c_cros_ec_tunnel] BTF: -22
> [ 5.322158] failed to validate module [typec] BTF: -22
> [ 5.327554] failed to validate module [snd_timer] BTF: -22
> [ 5.327573] failed to validate module [cros_usbpd_notify] BTF: -22
> [ 5.339272] failed to validate module [elan_i2c] BTF: -22
> [ 5.345821] failed to validate module [industrialio] BTF: -22
> [ 5.423113] failed to validate module [cfg80211] BTF: -22
> [ 5.443074] failed to validate module [cros_ec_dev] BTF: -22
> [ 5.448857] failed to validate module [snd_pci_acp3x] BTF: -22
> [ 5.454736] failed to validate module [cros_kbd_led_backlight] BTF: -22
> [ 5.461458] failed to validate module [regmap_i2c] BTF: -22
> [ 5.470228] failed to validate module [i2c_piix4] BTF: -22
> [ 5.491123] failed to validate module [i2c_hid] BTF: -22
> [ 5.491226] failed to validate module [chromeos_pstore] BTF: -22
> [ 5.496519] failed to validate module [coreboot_table] BTF: -22
> [ 5.502632] failed to validate module [snd_timer] BTF: -22
> [ 5.538916] failed to validate module [gsmi] BTF: -22
> [ 5.604971] failed to validate module [mii] BTF: -22
> [ 5.604971] failed to validate module [videobuf2_common] BTF: -22
> [ 5.604972] failed to validate module [sp5100_tco] BTF: -22
> [ 5.616068] failed to validate module [snd_soc_acpi] BTF: -22
> [ 5.680553] failed to validate module [bluetooth] BTF: -22
> [ 5.749320] failed to validate module [chromeos_pstore] BTF: -22
> [ 5.755440] failed to validate module [mii] BTF: -22
> [ 5.760522] failed to validate module [snd_timer] BTF: -22
> [ 5.783549] failed to validate module [bluetooth] BTF: -22
> [ 5.841561] failed to validate module [mii] BTF: -22
> [ 5.846699] failed to validate module [snd_timer] BTF: -22
> [ 5.892444] failed to validate module [mii] BTF: -22
> [ 5.897708] failed to validate module [snd_timer] BTF: -22
> [ 5.945507] failed to validate module [snd_timer] BTF: -22
>
> The full kernel log is available on [1]. The config used is available on
> [2] and the kernel/modules have been built using gcc-12.
>
> The issue is still present on next-20241105.
>
> I'm sending this report to track the regression while a fix is
> identified. The culprit commit hasn't been pinpointed yet, I'll report
> back once it's identified.
>
> Any feedback or suggestion for additional debugging steps would be greatly
> appreciated.
>
> Best,
>

Thanks for the report! Judging from the config, you're seeing this with
pahole v1.24. I have seen issues like this in the past where, during a
kernel build, module BTF is built against vmlinux BTF and then something
later re-triggers vmlinux BTF generation. If the regenerated vmlinux BTF
does not assign the same type ids to the same types, the result is
mismatch errors like the ones above, since the modules refer to
out-of-date type ids in vmlinux.

That's just a preliminary guess though; we'll need more info to get to
the bottom of this. A few suggestions to help debug:

- if you have build logs, check BTF generation for vmlinux. Did it in
  fact happen twice? Even better, if KernelCI saves the logs, feel free
  to send a pointer and I'll take a look.
- can you post the vmlinux (stripped of DWARF data if possible to limit
  size) and one of the failing modules somewhere so we can analyze them?
- failing that, running

    bpftool btf dump file /path/2/vmlinux_from_build > vmlinux.raw

  and uploading vmlinux.raw along with one of the failing module .ko
  files would help.

I've tried to reproduce this; no luck so far at my end.

Alan

> Laura
>
> [1] https://pastebin.com/raw/dtvzBkxh
> [2] https://pastebin.com/raw/a1MGi3wH
>
> #regzbot introduced: next-20241024..next-20241025
>
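
For triage, the repeated "failed to validate module" lines in the quoted log can be collapsed into one entry per module, which makes it easier to pick a failing .ko to upload. A minimal Python sketch, assuming the log lines have exactly the format quoted in the report; the function name summarize_btf_failures is hypothetical and not part of any kernel or KernelCI tooling:

```python
import re
from collections import Counter

# Matches kernel log lines of the form quoted in the report, e.g.
#   [ 5.291392] failed to validate module [i2c_hid] BTF: -22
# Group 1 is the module name, group 2 the (negative) error code.
BTF_FAIL_RE = re.compile(r"failed to validate module \[([^\]]+)\] BTF: (-?\d+)")


def summarize_btf_failures(log_text):
    """Count BTF validation failures per (module, error code) pair.

    Hypothetical helper for illustration only: it scans plain kernel
    log text and returns a Counter keyed by (module_name, error_code).
    """
    counts = Counter()
    for match in BTF_FAIL_RE.finditer(log_text):
        counts[(match.group(1), int(match.group(2)))] += 1
    return counts
```

Feeding it the text of the full kernel log from [1] and iterating over sorted(counts.items()) would give a per-module list of retries and error codes.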