On Wed, 15 Nov 2023 09:33:17 -0800
Roman Gushchin <roman.gushchin@xxxxxxxxx> wrote:

> Hm, interesting, so the issue is happening only with a kernel built with clang-16
> but not gcc? And you use 32-bit kernel? Do you know if it's reproducible on a
> 64-bit machine?

Correct. This only happens when I build the kernel with clang-16. A gcc-13 kernel build using the same .config is fine. That's why I reported it first on https://github.com/ClangBuiltLinux/linux/issues/1959

Surprisingly I was indeed able to reproduce the issue on my amd64 box! Here also the gcc-13 build is fine and the clang-16 build crashes:

[...]
KASAN: maybe wild-memory-access in range [0xaaaaaaaaaaaaaab8-0xaaaaaaaaaaaaaabf]
CPU: 26 PID: 1 Comm: systemd Not tainted 6.7.0-rc1-Zen3 #1
Hardware name: To Be Filled By O.E.M. B450M Steel Legend/B450M Steel Legend, BIOS P8.01 03/14/2023
RIP: 0010:obj_cgroup_charge_pages+0x27/0x2d5
Code: 90 90 90 55 41 57 41 56 41 55 41 54 53 89 d5 41 89 f6 49 89 ff 48 b8 00 00 00 00 00 fc ff df 49 83 c7 10 4d 89 fd 49 c1 ed 03 <41> 80 7c 05 00 00 74 08 4c 89 ff e8 5e 3a fd ff 49 8b 1f 4c 8d 63
RSP: 0018:ffffc90000067a78 EFLAGS: 00010212
RAX: dffffc0000000000 RBX: aaaaaaaaaaaaaaaa RCX: ffff8887df328b08
RDX: 000000000000000a RSI: 0000000000400cc0 RDI: aaaaaaaaaaaaaaaa
RBP: 000000000000000a R08: 3333333333333333 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8887df328b18
R13: 1555555555555557 R14: 0000000000400cc0 R15: aaaaaaaaaaaaaaba
FS:  00007fd18c5cb8c0(0000) GS:ffff8887df300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005614629e5098 CR3: 0000000108066000 CR4: 0000000000b50ef0
Call Trace:
 <TASK>
 ? __die_body+0x16/0x75
 ? die_addr+0x4a/0x70
 ? exc_general_protection+0x1c9/0x2d0
 ? cgroup_mkdir+0x455/0x9fb
 ? __x64_sys_mkdir+0x69/0x80
 ? asm_exc_general_protection+0x26/0x30
 ? obj_cgroup_charge_pages+0x27/0x2d5
 obj_cgroup_charge+0x114/0x1ab
 pcpu_alloc+0x1a6/0xa65
 ? mem_cgroup_css_alloc+0x1eb/0x1140
 ? cgroup_apply_control_enable+0x26b/0x7c0
 mem_cgroup_css_alloc+0x23f/0x1140
 cgroup_apply_control_enable+0x26b/0x7c0
 ? cgroup_kn_set_ugid+0x2d/0x1a0
 ? srso_alias_return_thunk+0x5/0xfbef5
 cgroup_mkdir+0x455/0x9fb
 ? __cfi_cgroup_mkdir+0x10/0x10
 kernfs_iop_mkdir+0x130/0x170
 vfs_mkdir+0x405/0x530
 do_mkdirat+0x188/0x1f0
 __x64_sys_mkdir+0x69/0x80
 do_syscall_64+0x7d/0x100
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? syscall_exit_to_user_mode+0x23/0xc0
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? do_syscall_64+0x89/0x100
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? do_syscall_64+0x89/0x100
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? do_syscall_64+0x89/0x100
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? do_syscall_64+0x89/0x100
 entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7fd18c7216e7
Code: 00 66 90 48 89 f2 b9 00 01 00 00 48 89 fe bf 9c ff ff ff e9 1b cc ff ff 66 2e 0f 1f 84 00 00 00 00 00 90 b8 53 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 19 47 0d 00 f7 d8 64 89 02 b8
RSP: 002b:00007ffd5d347128 EFLAGS: 00000246 ORIG_RAX: 0000000000000053
RAX: ffffffffffffffda RBX: 00005614628edf30 RCX: 00007fd18c7216e7
RDX: 0000000000000000 RSI: 00000000000001ed RDI: 00005614628fbd80
RBP: 00007ffd5d347170 R08: 000000000000000e R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fd18c8ce39a
R13: 00007ffd5d347140 R14: 00000000000000a0 R15: 00005614628c9560
 </TASK>
Modules linked in: efivarfs dmi_sysfs
---[ end trace 0000000000000000 ]---
RIP: 0010:obj_cgroup_charge_pages+0x27/0x2d5
Code: 90 90 90 55 41 57 41 56 41 55 41 54 53 89 d5 41 89 f6 49 89 ff 48 b8 00 00 00 00 00 fc ff df 49 83 c7 10 4d 89 fd 49 c1 ed 03 <41> 80 7c 05 00 00 74 08 4c 89 ff e8 5e 3a fd ff 49 8b 1f 4c 8d 63
RSP: 0018:ffffc90000067a78 EFLAGS: 00010212
RAX: dffffc0000000000 RBX: aaaaaaaaaaaaaaaa RCX: ffff8887df328b08
RDX: 000000000000000a RSI: 0000000000400cc0 RDI: aaaaaaaaaaaaaaaa
RBP: 000000000000000a R08: 3333333333333333 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8887df328b18
R13: 1555555555555557 R14: 0000000000400cc0 R15: aaaaaaaaaaaaaaba
FS:  00007fd18c5cb8c0(0000) GS:ffff8887df300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005614629e5098 CR3: 0000000108066000 CR4: 0000000000b50ef0
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
Kernel Offset: 0x37000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Rebooting in 40 seconds..

Though the trace looks a bit different compared to the one from my 32-bit Thinkpad T60, it should be the same issue, as reverting your patchset 'fixes' the clang-16 built kernel and the machine boots up ok.

> Completely speculative, but can you please check if the following patch
> resolves the problem?
>
> --
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 10917c3e1f03..a0df246e81f0 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1186,6 +1186,9 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
>  #ifdef CONFIG_MEMCG
>         tsk->active_memcg = NULL;
>  #endif
> +#ifdef CONFIG_MEMCG_KMEM
> +       tsk->objcg = NULL;
> +#endif
>
>  #ifdef CONFIG_CPU_SUP_INTEL
>         tsk->reported_split_lock = 0;

Thanks for looking into this! Unfortunately the patch did not help. I have only tried it on my T60 so far, though, and not on my amd64 box.
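
For anyone cross-checking the register dump in the amd64 trace, here is a rough sketch of the arithmetic as I read it from the Code: bytes. It assumes the bytes before the faulting instruction decode to add $0x10,%r15; mov %r15,%r13; shr $3,%r13; cmpb $0,(%r13,%rax), i.e. a KASAN shadow-byte check with the shadow offset held in RAX. It also assumes 48-bit (4-level paging) virtual addresses on this machine. The helper names are made up for illustration and are not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only, not kernel APIs. */

/* KASAN maps each 8-byte granule to one shadow byte:
 * shadow = (addr >> 3) + shadow_offset. */
static uint64_t kasan_shadow_addr(uint64_t addr, uint64_t shadow_offset)
{
    return (addr >> 3) + shadow_offset;
}

/* First and last byte of the 8-byte granule that contains addr. */
static uint64_t granule_start(uint64_t addr) { return addr & ~(uint64_t)7; }
static uint64_t granule_end(uint64_t addr)   { return addr | (uint64_t)7; }

/* Assumption: 48-bit virtual addresses, so bits 63..47 must all be
 * equal for an address to be canonical. */
static bool is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47;      /* the upper 17 bits */
    return top == 0 || top == 0x1ffff;
}
```

Plugging in the values from the dump: R15 = 0xaaaaaaaaaaaaaaba is RDI + 0x10, its granule is exactly the range KASAN reports, R15 >> 3 matches R13, and adding RAX gives a shadow address that is not canonical under the 48-bit assumption. That would be consistent with the general protection fault being raised by the shadow check itself, before the real load from the bogus pointer. It says nothing about where the 0xaa... value came from.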

Also some data about my amd64 box:

 # inxi -bz
System:
  Kernel: 6.7.0-rc1-Zen3-dirty arch: x86_64 bits: 64 Console: pty pts/0
    Distro: Gentoo Base System release 2.14
Machine:
  Type: Desktop Mobo: ASRock model: B450M Steel Legend serial: <filter>
    UEFI: American Megatrends v: P8.01 date: 03/14/2023
CPU:
  Info: 16-core AMD Ryzen 9 5950X [MT MCP] speed (MHz): avg: 682
    min/max: 550/5084
Graphics:
  Device-1: AMD Navi 22 [Radeon RX 6700/6700 XT/6750 XT / 6800M/6850M XT]
    driver: amdgpu v: kernel
  Device-2: AMD RV516 [Radeon X1300/X1550 Series] driver: radeon v: kernel
  Display: x11 server: X.org v: 1.21.1.9 driver: X: loaded: amdgpu
    unloaded: fbdev,modesetting,radeon dri: radeonsi gpu: amdgpu,radeon
    resolution: <missing: xdpyinfo/xrandr> resolution: 1: 3840x2160
    2: 3840x2160
  API: OpenGL v: 4.5 Mesa 23.1.8 renderer: llvmpipe (LLVM 16.0.6 256 bits)
Network:
  Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
    driver: r8169

Full dmesg attached (1. without KASAN 2. with KASAN), amd64 kernel .config attached.

Regards,
Erhard
Attachment: dmesg_67-rc1_zen3_01
Attachment: dmesg_67-rc1_zen3_02
Attachment: config_67-rc1_zen3