On Mon, Feb 19, 2024 at 11:01:16PM +0100, Erhard Furtner wrote: > Greetings! > > 'modprobe -v ttm-device-test' on my Ryzen 5950X amd64 box and on my Talos II (ppc64) leads to immediate list_add corruption. > > The machines stay useable via VNC but the issue seems to cause memory corruption which shows up later on when PAGE_POISONING is enabled: > > [...] > KTAP version 1 > 1..1 > KTAP version 1 > # Subtest: ttm_device > # module: ttm_device_test > 1..5 > ok 1 ttm_device_init_basic > # ttm_device_init_multiple: ASSERTION FAILED at drivers/gpu/drm/ttm/tests/ttm_device_test.c:68 > Expected list_count_nodes(&ttm_devs[0].device_list) == num_dev, but > list_count_nodes(&ttm_devs[0].device_list) == 4 (0x4) > num_dev == 3 (0x3) > not ok 2 ttm_device_init_multiple > list_add corruption. prev->next should be next (ffffffffc05cd428), but was 6b6b6b6b6b6b6b6b. (prev=ffffa0b1a5c034f0). > ------------[ cut here ]------------ > kernel BUG at lib/list_debug.c:32! > invalid opcode: 0000 [#1] SMP NOPTI > CPU: 6 PID: 2129 Comm: kunit_try_catch Tainted: G N 6.7.5-Zen3 #1 > Hardware name: To Be Filled By O.E.M. B550M Pro4/B550M Pro4, BIOS P3.40 01/18/2024 > RIP: 0010:__list_add_valid_or_report+0x67/0x9c > Code: c7 c7 26 ff c4 90 48 89 c6 e8 2f 32 ca ff 0f 0b 4c 8b 02 49 39 f0 74 14 48 89 d1 48 c7 c7 78 ff c4 90 4c 89 c2 e8 13 32 ca ff <0f> 0b 48 39 d7 74 05 4c 39 c7 75 17 48 89 f1 48 89 c2 48 89 fe 48 > RSP: 0018:ffffb23b05d27df8 EFLAGS: 00010246 > RAX: 0000000000000075 RBX: 0000000000000000 RCX: 0000000000000000 > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 > RBP: ffffa0b1a5c034f0 R08: 0000000000000000 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000000 R12: ffffa0b1843b2628 > R13: ffffa0b1b7c1f478 R14: ffffffffc0696480 R15: ffffa0b1a5c11000 > FS: 0000000000000000(0000) GS:ffffa0b85eb80000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00007ff09c005038 CR3: 000000026ce14000 CR4: 0000000000b50ef0 > Call Trace: > <TASK> > ? __die_body+0x15/0x65 > ? die+0x2f/0x48 > ? do_trap+0x76/0x109 > ? __list_add_valid_or_report+0x67/0x9c > ? __list_add_valid_or_report+0x67/0x9c > ? do_error_trap+0x69/0xa6 > ? __list_add_valid_or_report+0x67/0x9c > ? exc_invalid_op+0x4d/0x71 > ? __list_add_valid_or_report+0x67/0x9c > ? asm_exc_invalid_op+0x1a/0x20 > ? __list_add_valid_or_report+0x67/0x9c > ? __list_add_valid_or_report+0x67/0x9c > ttm_device_init+0x10e/0x157 [ttm] > ttm_device_kunit_init+0x3d/0x51 [ttm_kunit_helpers] > ttm_device_fini_basic+0x6d/0x1b3 [ttm_device_test] > ? timekeeping_get_ns+0x19/0x3b > ? srso_alias_return_thunk+0x5/0xfbef5 > ? ktime_get_ts64+0x40/0x92 > kunit_try_run_case+0xaf/0x163 [kunit] > ? kunit_try_catch_throw+0x1b/0x1b [kunit] > ? kunit_try_catch_throw+0x1b/0x1b [kunit] > kunit_generic_run_threadfn_adapter+0x15/0x20 [kunit] > kthread+0xcf/0xd7 > ? kthread_complete_and_exit+0x1a/0x1a > ret_from_fork+0x23/0x35 > ? kthread_complete_and_exit+0x1a/0x1a > ret_from_fork_asm+0x11/0x20 > </TASK> > Modules linked in: ttm_device_test ttm_kunit_helpers drm_kunit_helpers kunit rfkill dm_crypt nhpoly1305_avx2 nhpoly1305 chacha_generic chacha_x86_64 libchacha adiantum libpoly1305 algif_skcipher input_leds joydev hid_generic usbhid hid amdgpu snd_hda_codec_hdmi amd64_edac snd_hda_intel amdxcp mfd_core snd_intel_dspcfg edac_mce_amd gpu_sched snd_hda_codec video snd_hwdep drm_suballoc_helper snd_hda_core i2c_algo_bit drm_ttm_helper snd_pcm wmi_bmof ttm snd_timer evdev drm_exec snd drm_display_helper soundcore kvm_amd k10temp drm_buddy rapl wmi gpio_amdpt gpio_generic button lz4 lz4_compress lz4_decompress zram sg nct6775 nct6775_core hwmon_vid hwmon loop configfs sha512_ssse3 sha512_generic sha256_ssse3 sha1_ssse3 sha1_generic aesni_intel libaes crypto_simd cryptd xhci_pci xhci_hcd ccp usbcore usb_common sunrpc dm_mod pkcs8_key_parser efivarfs > ---[ end trace 0000000000000000 ]--- > RIP: 0010:__list_add_valid_or_report+0x67/0x9c > Code: c7 c7 26 ff c4 90 48 89 c6 e8 2f 32 ca ff 0f 0b 4c 8b 02 49 39 f0 74 14 48 89 d1 48 c7 c7 78 ff c4 90 4c 89 c2 e8 13 32 ca ff <0f> 0b 48 39 d7 74 05 4c 39 c7 75 17 48 89 f1 48 89 c2 48 89 fe 48 > RSP: 0018:ffffb23b05d27df8 EFLAGS: 00010246 > RAX: 0000000000000075 RBX: 0000000000000000 RCX: 0000000000000000 > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 > RBP: ffffa0b1a5c034f0 R08: 0000000000000000 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000000 R12: ffffa0b1843b2628 > R13: ffffa0b1b7c1f478 R14: ffffffffc0696480 R15: ffffa0b1a5c11000 > FS: 0000000000000000(0000) GS:ffffa0b85eb80000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00007ff09c005038 CR3: 000000026ce14000 CR4: 0000000000b50ef0 > Key type dns_resolver registered > NFS: Registering the id_resolver key type > Key type id_resolver registered > Key type id_legacy registered > # ttm_device_fini_basic: try timed out > general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b6b: 0000 [#2] SMP NOPTI > CPU: 26 PID: 2119 Comm: modprobe Tainted: G D N 6.7.5-Zen3 #1 > Hardware name: To Be Filled By O.E.M. B550M Pro4/B550M Pro4, BIOS P3.40 01/18/2024 > RIP: 0010:kthread_stop+0x3c/0x78 > Code: f0 0f c1 43 28 be 02 00 00 00 85 c0 74 0c 8d 50 01 09 c2 79 0a be 01 00 00 00 e8 f5 31 37 00 48 89 df e8 35 f1 ff ff 48 89 c5 <f0> 80 08 02 48 89 df e8 6a ff ff ff f0 80 4b 02 02 48 89 df e8 f6 > RSP: 0018:ffffb23b01fff938 EFLAGS: 00010246 > RAX: 6b6b6b6b6b6b6b6b RBX: ffffa0b170ab6040 RCX: 0000000000000000 > RDX: 000000006b6b6b6f RSI: 0000000000000002 RDI: 0000000000000000 > RBP: 6b6b6b6b6b6b6b6b R08: 0000000000000000 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000000 R12: ffffa0b170ab6040 > R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000 > FS: 00007f9321e6ec40(0000) GS:ffffa0b85f080000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00005592ea51ef40 CR3: 0000000189590000 CR4: 0000000000b50ef0 > Call Trace: > <TASK> > ? __die_body+0x15/0x65 > ? die_addr+0x37/0x50 > ? exc_general_protection+0x1b6/0x1ec > ? asm_exc_general_protection+0x26/0x30 > ? kthread_stop+0x3c/0x78 > ? kthread_stop+0x39/0x78 > kunit_try_catch_run+0xc9/0x155 [kunit] > kunit_run_case_catch_errors+0x3f/0x93 [kunit] > kunit_run_tests+0x182/0x516 [kunit] > ? kunit_try_run_case_cleanup+0x39/0x39 [kunit] > ? kunit_catch_run_case_cleanup+0x85/0x85 [kunit] > __kunit_test_suites_init+0x64/0x83 [kunit] > kunit_module_notify+0xda/0x177 [kunit] > notifier_call_chain+0x5a/0x92 > blocking_notifier_call_chain+0x3e/0x60 > do_init_module+0xcb/0x218 > init_module_from_file+0x7a/0x99 > __do_sys_finit_module+0x162/0x223 > do_syscall_64+0x6e/0xd8 > entry_SYSCALL_64_after_hwframe+0x4b/0x53 > RIP: 0033:0x7f9321f7a479 > Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 87 89 0c 00 f7 d8 64 89 01 48 > RSP: 002b:00007ffe2e350908 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 > RAX: ffffffffffffffda RBX: 00005590b57cef40 RCX: 00007f9321f7a479 > RDX: 0000000000000000 RSI: 00005590b5100c7c RDI: 0000000000000007 > RBP: 0000000000000000 R08: 00007f9322043b20 R09: 0000000000000000 > R10: 0000000000000050 R11: 0000000000000246 R12: 0000000000040000 > R13: 00005590b5100c7c R14: 00005590b57cefe0 R15: 0000000000000000 > </TASK> > Modules linked in: nfsv4 dns_resolver nfs lockd grace ttm_device_test ttm_kunit_helpers drm_kunit_helpers kunit rfkill dm_crypt nhpoly1305_avx2 nhpoly1305 chacha_generic chacha_x86_64 libchacha adiantum libpoly1305 algif_skcipher input_leds joydev hid_generic usbhid hid amdgpu snd_hda_codec_hdmi amd64_edac snd_hda_intel amdxcp mfd_core snd_intel_dspcfg edac_mce_amd gpu_sched snd_hda_codec video snd_hwdep drm_suballoc_helper snd_hda_core i2c_algo_bit drm_ttm_helper snd_pcm wmi_bmof ttm snd_timer evdev drm_exec snd drm_display_helper soundcore kvm_amd k10temp drm_buddy rapl wmi gpio_amdpt gpio_generic button lz4 lz4_compress lz4_decompress zram sg nct6775 nct6775_core hwmon_vid hwmon loop configfs sha512_ssse3 sha512_generic sha256_ssse3 sha1_ssse3 sha1_generic aesni_intel libaes crypto_simd cryptd xhci_pci xhci_hcd ccp usbcore usb_common sunrpc dm_mod pkcs8_key_parser efivarfs > ---[ end trace 0000000000000000 ]--- > RIP: 0010:__list_add_valid_or_report+0x67/0x9c > Code: c7 c7 26 ff c4 90 48 89 c6 e8 2f 32 ca ff 0f 0b 4c 8b 02 49 39 f0 74 14 48 89 d1 48 c7 c7 78 ff c4 90 4c 89 c2 e8 13 32 ca ff <0f> 0b 48 39 d7 74 05 4c 39 c7 75 17 48 89 f1 48 89 c2 48 89 fe 48 > RSP: 0018:ffffb23b05d27df8 EFLAGS: 00010246 > RAX: 0000000000000075 RBX: 0000000000000000 RCX: 0000000000000000 > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 > RBP: ffffa0b1a5c034f0 R08: 0000000000000000 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000000 R12: ffffa0b1843b2628 > R13: ffffa0b1b7c1f478 R14: ffffffffc0696480 R15: ffffa0b1a5c11000 > FS: 00007f9321e6ec40(0000) GS:ffffa0b85f080000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00005592ea51ef40 CR3: 0000000189590000 CR4: 0000000000b50ef0 > ============================================================================= > BUG task_struct (Tainted: G D N): Poison overwritten > ----------------------------------------------------------------------------- > > 0xffffa0b170ab6068-0xffffa0b170ab6068 @offset=24680. First byte 0x6c instead of 0x6b > Slab 0xffffea8944c2ac00 objects=8 used=8 fp=0x0000000000000000 flags=0x4000000000000840(slab|head|zone=1) > Object 0xffffa0b170ab6040 @offset=24640 fp=0x0000000000000000 > > Redzone ffffa0b170ab6000: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ > Redzone ffffa0b170ab6010: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ > Redzone ffffa0b170ab6020: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ > Redzone ffffa0b170ab6030: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ > Object ffffa0b170ab6040: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object ffffa0b170ab6050: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object ffffa0b170ab6060: 6b 6b 6b 6b 6b 6b 6b 6b 6c 6b 6b 6b 6b 6b 6b 6b kkkkkkkklkkkkkkk > Object ffffa0b170ab6070: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [...] > Object ffffa0b170ab6fb0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object ffffa0b170ab6fc0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk. > Redzone ffffa0b170ab6fd0: bb bb bb bb bb bb bb bb ........ > Padding ffffa0b170ab6fe0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ > Padding ffffa0b170ab6ff0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ > CPU: 13 PID: 2 Comm: kthreadd Tainted: G D N 6.7.5-Zen3 #1 > Hardware name: To Be Filled By O.E.M. B550M Pro4/B550M Pro4, BIOS P3.40 01/18/2024 > Call Trace: > <TASK> > dump_stack_lvl+0x37/0x52 > check_bytes_and_report+0xa7/0x107 > check_object+0x157/0x253 > alloc_debug_processing+0x5d/0x111 > ___slab_alloc+0x288/0x561 > ? copy_process+0x35f/0x2276 > ? kthread_is_per_cpu+0x22/0x22 > ret_from_fork+0x23/0x35 > ? kthread_is_per_cpu+0x22/0x22 > ret_from_fork_asm+0x11/0x20 > </TASK> > FIX task_struct: Restoring Poison 0xffffa0b170ab6068-0xffffa0b170ab6068=0x6b > FIX task_struct: Marking all objects used > > > The Talos II ppc64 trace looks a bit different: > > [...] > KTAP version 1 > 1..1 > KTAP version 1 > # Subtest: ttm_pool > # module: ttm_pool_test > 1..8 > KTAP version 1 > # Subtest: ttm_pool_alloc_basic > ok 1 One page > ok 2 More than one page > ok 3 Above the allocation limit > # ttm_pool_alloc_basic: ASSERTION FAILED at drivers/gpu/drm/ttm/tests/ttm_pool_test.c:162 > Expected err == 0, but > err == -12 (0xfffffffffffffff4) > not ok 4 One page, with coherent DMA mappings enabled > list_add corruption. prev->next should be next (c00800000cf64fc0), but was 0000000000000000. (prev=c0002000061a4ad0). > ------------[ cut here ]------------ > kernel BUG at lib/list_debug.c:32! > Oops: Exception in kernel mode, sig: 5 [#1] > BE PAGE_SIZE=4K MMU=Radix SMP NR_CPUS=32 NUMA PowerNV > Modules linked in: ttm_pool_test ttm_kunit_helpers drm_kunit_helpers kunit snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore cfg80211 rfkill input_leds evdev hid_generic usbhid hid radeon xts xhci_pci ctr xhci_hcd drm_suballoc_helper i2c_algo_bit drm_ttm_helper cbc ttm aes_generic ofpart usbcore libaes powernv_flash drm_display_helper at24 vmx_crypto gf128mul mtd backlight usb_common regmap_i2c opal_prd ibmpowernv lz4 lz4_compress lz4_decompress zram pkcs8_key_parser powernv_cpufreq loop dm_mod configfs > CPU: 29 PID: 934 Comm: kunit_try_catch Tainted: G TN 6.7.5-gentoo-P9 #1 > Hardware name: T2P9D01 REV 1.01 POWER9 0x4e1202 opal:skiboot-bc106a0 PowerNV > NIP: c000000000864744 LR: c000000000864740 CTR: 0000000000000000 > REGS: c000200015333a30 TRAP: 0700 Tainted: G TN (6.7.5-gentoo-P9) > MSR: 9000000000029032 <SF,HV,EE,ME,IR,DR,RI> CR: 24000222 XER: 00000000 > CFAR: c0000000001d5620 IRQMASK: 0 > GPR00: 0000000000000000 c000200015333cd0 c0000000011b4700 0000000000000075 > GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > GPR12: 0000000000000000 c0002007fa4d5e00 c000000000182548 c0002000066aa1c0 > GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > GPR24: 0000000000000000 c0002000061a4010 c00800000cf64fc0 c0002000061a4020 > GPR28: c0002000061a4ad0 c00800000cf64fa8 c00800000cf64fa0 c0002000061a4010 > NIP [c000000000864744] __list_add_valid_or_report+0xd4/0x120 > LR [c000000000864740] __list_add_valid_or_report+0xd0/0x120 > Call Trace: > [c000200015333cd0] [c000000000864740] __list_add_valid_or_report+0xd0/0x120 (unreliable) > [c000200015333d30] [c00800000cf5eed8] ttm_pool_type_init+0xa0/0x120 [ttm] > [c000200015333d80] [c00800000cf5efec] ttm_pool_init+0x94/0x170 [ttm] > [c000200015333de0] [c00800000cc6b324] ttm_pool_alloc_basic+0x9c/0x670 [ttm_pool_test] > [c000200015333ea0] [c00800000bddf7f0] kunit_try_run_case+0xb8/0x220 [kunit] > [c000200015333f60] [c00800000bde27c8] kunit_generic_run_threadfn_adapter+0x30/0x50 [kunit] > [c000200015333f90] [c000000000182670] kthread+0x130/0x140 > [c000200015333fe0] [c00000000000d030] start_kernel_thread+0x14/0x18 > Code: f8010070 4b970ea9 60000000 0fe00000 7c0802a6 3c62fff1 7d064378 7d244b78 38639600 f8010070 4b970e85 60000000 <0fe00000> 7c0802a6 3c62fff1 7ca62b78 > ---[ end trace 0000000000000000 ]--- > > note: kunit_try_catch[934] exited with irqs disabled > # ttm_pool_alloc_basic: try timed out > BUG: Unable to handle kernel data access at 0x6b6b6b6b6b6b6b6b > Faulting instruction address: 0xc000000000181ae4 > Oops: Kernel access of bad area, sig: 11 [#2] > BE PAGE_SIZE=4K MMU=Radix SMP NR_CPUS=32 NUMA PowerNV > Modules linked in: ttm_pool_test ttm_kunit_helpers drm_kunit_helpers kunit snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore cfg80211 rfkill input_leds evdev hid_generic usbhid hid radeon xts xhci_pci ctr xhci_hcd drm_suballoc_helper i2c_algo_bit drm_ttm_helper cbc ttm aes_generic ofpart usbcore libaes powernv_flash drm_display_helper at24 vmx_crypto gf128mul mtd backlight usb_common regmap_i2c opal_prd ibmpowernv lz4 lz4_compress lz4_decompress zram pkcs8_key_parser powernv_cpufreq loop dm_mod configfs > CPU: 17 PID: 921 Comm: modprobe Tainted: G D TN 6.7.5-gentoo-P9 #1 > Hardware name: T2P9D01 REV 1.01 POWER9 0x4e1202 opal:skiboot-bc106a0 PowerNV > NIP: c000000000181ae4 LR: c00800000bde2a54 CTR: c000000000181a80 > REGS: c0002000153871b0 TRAP: 0380 Tainted: G D TN (6.7.5-gentoo-P9) > MSR: 900000000280b032 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI> CR: 44422282 XER: 00000000 > CFAR: c00800000bde53ec IRQMASK: 0 > GPR00: c00800000bde2a54 c000200015387450 c0000000011b4700 c0000000b1e34d00 > GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > GPR08: 0000000000000000 0000000000000000 000000006b6b6b6c c00800000bde53d8 > GPR12: c000000000181a80 c0002007fa4dd600 0000000020000000 0000000020000000 > GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > GPR20: 0000000000000002 0000000020000000 c0000000023d78f8 c0000000023d78a8 > GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > GPR28: c0002000153876c0 6b6b6b6b6b6b6b6b c0000000b1e34d00 c0000000b1e34eb8 > NIP [c000000000181ae4] kthread_stop+0x64/0x1c0 > LR [c00800000bde2a54] kunit_try_catch_run+0x26c/0x2c0 [kunit] > Call Trace: > [c000200015387450] [c0000000001d5934] vprintk+0x84/0xc0 (unreliable) > [c000200015387490] [c00800000bde2a54] kunit_try_catch_run+0x26c/0x2c0 [kunit] > [c000200015387540] [c00800000bde4f14] kunit_run_case_catch_errors+0x60/0xf0 [kunit] > [c0002000153875a0] [c00800000bddf448] kunit_run_tests+0x560/0x680 [kunit] > [c0002000153878d0] [c00800000bddf614] __kunit_test_suites_init+0xac/0x160 [kunit] > [c000200015387970] [c00800000bde349c] kunit_exec_run_tests+0x44/0xb0 [kunit] > [c0002000153879f0] [c00800000bddecbc] kunit_module_notify+0x4d4/0x590 [kunit] > [c000200015387a90] [c0000000001842f0] notifier_call_chain+0xa0/0x190 > [c000200015387b30] [c00000000018480c] blocking_notifier_call_chain+0x5c/0xb0 > [c000200015387b70] [c00000000020cf64] do_init_module+0x234/0x330 > [c000200015387bf0] [c00000000021054c] init_module_from_file+0x9c/0xf0 > [c000200015387cc0] [c000000000210740] sys_finit_module+0x190/0x420 > [c000200015387d80] [c00000000002b808] system_call_exception+0x1b8/0x3a0 > [c000200015387e50] [c00000000000c270] system_call_vectored_common+0xf0/0x280 > --- interrupt: 3000 at 0x3fff9eb3d7c8 > NIP: 00003fff9eb3d7c8 LR: 0000000000000000 CTR: 0000000000000000 > REGS: c000200015387e80 TRAP: 3000 Tainted: G D TN (6.7.5-gentoo-P9) > MSR: 900000000280f032 <SF,HV,VEC,VSX,EE,PR,FP,ME,IR,DR,RI> CR: 48422244 XER: 00000000 > IRQMASK: 0 > GPR00: 0000000000000161 00003fffc80d3ab0 00003fff9ec37100 0000000000000007 > GPR04: 0000000134f6df90 0000000000000000 000000000000001f 0000000000000045 > GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > GPR12: 0000000000000000 00003fff9ef7fbe0 0000000020000000 0000000020000000 > GPR16: 0000000000000000 0000000000000000 0000000000000020 0000000020000000 > GPR20: 0000000161994850 0000000020000000 0000000000000000 0000000000000000 > GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000161993f90 > GPR28: 0000000134f6df90 0000000000040000 0000000000000000 0000000161993cc0 > NIP [00003fff9eb3d7c8] 0x3fff9eb3d7c8 > LR [0000000000000000] 0x0 > --- interrupt: 3000 > Code: 40c2fff4 2c090000 41820164 39490001 7d494b78 2c090000 418000f4 813e01a8 6d290020 79295fe2 0b090000 ebbe0738 <7d20e8a8> 61290002 7d20e9ad 40c2fff4 > ---[ end trace 0000000000000000 ]--- > > note: modprobe[921] exited with irqs disabled > ============================================================================= > BUG task_struct (Tainted: G D TN): Poison overwritten > ----------------------------------------------------------------------------- > > 0xc0000000b1e34ebb-0xc0000000b1e34ebb @offset=20155. First byte 0x6c instead of 0x6b > Slab 0xc00c000002c78c00 objects=5 used=4 fp=0xc0000000b1e33380 flags=0x7ffc0000000840(slab|head|node=0|zone=0|lastcpupid=0x1fff) > Object 0xc0000000b1e34d00 @offset=19712 fp=0xc0000000b1e33380 > > Redzone c0000000b1e34c80: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ > Redzone c0000000b1e34c90: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ > Redzone c0000000b1e34ca0: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ > Redzone c0000000b1e34cb0: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ > Redzone c0000000b1e34cc0: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ > Redzone c0000000b1e34cd0: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ > Redzone c0000000b1e34ce0: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ > Redzone c0000000b1e34cf0: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ > Object c0000000b1e34d00: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34d10: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34d20: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34d30: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34d40: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34d50: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34d60: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34d70: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34d80: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34d90: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34da0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34db0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34dc0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34dd0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34de0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34df0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34e00: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34e10: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34e20: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34e30: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34e40: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34e50: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34e60: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34e70: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34e80: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34e90: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34ea0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Object c0000000b1e34eb0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6c 6b 6b 6b 6b kkkkkkkkkkklkkkk > Object c0000000b1e34ec0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > [...] > Object c0000000b1e35cf0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk > Redzone c0000000b1e36580: bb bb bb bb bb bb bb bb ........ > Padding c0000000b1e36590: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ > Padding c0000000b1e365a0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ > Padding c0000000b1e365b0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ > Padding c0000000b1e365c0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ > Padding c0000000b1e365d0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ > Padding c0000000b1e365e0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ > Padding c0000000b1e365f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ > CPU: 28 PID: 2 Comm: kthreadd Tainted: G D TN 6.7.5-gentoo-P9 #1 > Hardware name: T2P9D01 REV 1.01 POWER9 0x4e1202 opal:skiboot-bc106a0 PowerNV > Call Trace: > [c00000000593b890] [c000000000e8ecf8] dump_stack_lvl+0x6c/0xb0 (unreliable) > [c00000000593b8c0] [c00000000041dad0] print_trailer+0x1e0/0x22c > [c00000000593b940] [c0000000004155f4] check_bytes_and_report+0x224/0x240 > [c00000000593b9f0] [c00000000041596c] check_object+0x35c/0x4a0 > [c00000000593ba40] [c0000000004168dc] alloc_debug_processing+0xdc/0x270 > [c00000000593bac0] [c000000000416c8c] get_partial_node.part.0+0x21c/0x460 > [c00000000593bb80] [c000000000417148] ___slab_alloc+0x278/0xb20 > [c00000000593bc90] [c000000000417b3c] kmem_cache_alloc_node+0x14c/0x630 > [c00000000593bd20] [c000000000140618] copy_process+0x408/0x3270 > [c00000000593be00] [c0000000001435f4] kernel_clone+0xc4/0x5b0 > [c00000000593be80] [c000000000143dc4] kernel_thread+0x84/0xc0 > [c00000000593bf40] [c0000000001829bc] kthreadd+0x1ec/0x290 > [c00000000593bfe0] [c00000000000d030] start_kernel_thread+0x14/0x18 > FIX task_struct: Restoring Poison 0xc0000000b1e34ebb-0xc0000000b1e34ebb=0x6b > FIX task_struct: Marking all objects used > > > Full dmesg and kernel .config of both machines attached. > > Regards, > Erhard > [ 0.000000] Linux version 6.7.5-Zen3 (root@supah) (gcc (Gentoo 13.2.1_p20240113-r1 p12) 13.2.1 20240113, GNU ld (Gentoo 2.41 p5) 2.41.0) #1 SMP Mon Feb 19 12:44:46 -00 2024 Is it vanilla kernel (i.e. no patches applied)? Can you also check current mainline (v6.8-rc5)? Confused... -- An old man doll... just what I always wanted! - Clara
Attachment:
signature.asc
Description: PGP signature