Hi Nathan,
On Mon, Oct 21, 2024 at 03:15:19PM -0700, Nathan Chancellor wrote:
Hi Mike,
On Wed, Oct 16, 2024 at 03:24:22PM +0300, Mike Rapoport wrote:
From: "Mike Rapoport (Microsoft)" <rppt@xxxxxxxxxx>
When module text memory is allocated with ROX permissions, the memory
at the actual address where the module will live contains invalid
instructions, and a separate writable copy holds the actual module
code.
Update relocations and alternatives patching to deal with it.
Signed-off-by: Mike Rapoport (Microsoft) <rppt@xxxxxxxxxx>
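To illustrate the idea in the commit message, here is a minimal userspace
sketch, not the kernel implementation. It assumes two equally sized buffers:
one stands in for the ROX mapping at the final address and holds only filler
bytes, and the other is the writable copy. All names (struct text_region,
writable_addr, apply_pc32, region_publish) are hypothetical and do not come
from the patch. The point it shows is that a relocation value is computed
from the final address, while the store goes to the same offset in the
writable copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TEXT_SIZE 64

/* Hypothetical stand-in for a module text region (not a kernel structure). */
struct text_region {
	uint8_t final_image[TEXT_SIZE];	/* plays the role of the ROX mapping */
	uint8_t rw_copy[TEXT_SIZE];	/* writable copy holding the real code */
};

static void region_init(struct text_region *r)
{
	/* Final location starts out as invalid filler (0xcc is int3 on x86). */
	memset(r->final_image, 0xcc, TEXT_SIZE);
	/* Writable copy starts as placeholder "code" (0x90 is nop on x86). */
	memset(r->rw_copy, 0x90, TEXT_SIZE);
}

/* Map an address in the final image to the same offset in the writable copy. */
static uint8_t *writable_addr(struct text_region *r, const uint8_t *final_addr)
{
	ptrdiff_t off = final_addr - r->final_image;

	assert(off >= 0 && off < TEXT_SIZE);
	return r->rw_copy + off;
}

/*
 * PC-relative 32-bit relocation: the value depends on where the code will
 * finally run (loc in the final image), but the bytes are written through
 * the writable copy because the final image is treated as read-only.
 */
static void apply_pc32(struct text_region *r, uint8_t *loc,
		       const uint8_t *sym, int32_t addend)
{
	int32_t val = (int32_t)((intptr_t)sym + addend - (intptr_t)loc);

	memcpy(writable_addr(r, loc), &val, sizeof(val));
}

/* Stand-in for making the finished text visible at its final address. */
static void region_publish(struct text_region *r)
{
	memcpy(r->final_image, r->rw_copy, TEXT_SIZE);
}
```

In this sketch, any patching code that reads the final address before
region_publish() sees only the 0xcc filler, so it has to be pointed at the
writable copy as well.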
Sorry that you have to hear from me again :) It seems that module
loading is still broken with this version of the patch. I missed this in
my earlier testing because my regular virtual machine testing only
covers a monolithic kernel. If I build and install the kernel and
modules in the VM via a distribution package, I get the following splat
at boot:
Starting systemd-udevd version 256.7-1-arch
[ 0.882312] SMP alternatives: Something went horribly wrong trying to rewrite the CFI implementation.
[ 0.883526] CFI failure at do_one_initcall+0x128/0x380 (target: init_module+0x0/0xff0 [crc32c_intel]; expected type: 0x0c7a3a22)
[ 0.884802] Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 0.885434] CPU: 3 UID: 0 PID: 157 Comm: modprobe Tainted: G W 6.12.0-rc3-debug-next-20241021-06324-g63b3ff03d91a #1 291f0fd70f293827edec681d3c5304f5807a3c7b
[ 0.887084] Tainted: [W]=WARN
[ 0.887409] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 2/2/2022
[ 0.888241] RIP: 0010:do_one_initcall+0x128/0x380
[ 0.888720] Code: f3 0f 1e fa 41 be ff ff ff ff e9 0f 01 00 00 0f 1f 44 00 00 41 81 e7 ff ff ff 7f 49 89 db 41 ba de c5 85 f3 45 03 53 f1 74 02 <0f> 0b 41 ff d3 0f 1f 00 41 89 c6 0f 1f 44 00 00 c6 04 24 00 65 8b
[ 0.890598] RSP: 0018:ff3f93e5c052f970 EFLAGS: 00010217
[ 0.891129] RAX: ffffffffb4c105b8 RBX: ffffffffc0602010 RCX: 0000000000000000
[ 0.891850] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffc0602010
[ 0.892588] RBP: ff3f93e5c052fc88 R08: 0000000000000020 R09: 0000000000000000
[ 0.893305] R10: 000000002a378b84 R11: ffffffffc0602010 R12: 00000000000069c6
[ 0.894003] R13: ff1f0090c5596900 R14: ff1f0090c15a55c0 R15: 0000000000000000
[ 0.894693] FS: 00007ffb712c0740(0000) GS:ff1f00942fb80000(0000) knlGS:0000000000000000
[ 0.895453] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.896020] CR2: 00007ffffc4424c8 CR3: 0000000100af4002 CR4: 0000000000771ef0
[ 0.896698] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.897391] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 0.898077] PKRU: 55555554
[ 0.898337] Call Trace:
[ 0.898577] <TASK>
[ 0.898784] ? __die_body+0x6a/0xb0
[ 0.899132] ? die+0xa4/0xd0
[ 0.899413] ? do_trap+0xa6/0x180
[ 0.899740] ? do_one_initcall+0x128/0x380
[ 0.900130] ? do_one_initcall+0x128/0x380
[ 0.900523] ? handle_invalid_op+0x6a/0x90
[ 0.900917] ? do_one_initcall+0x128/0x380
[ 0.901311] ? exc_invalid_op+0x38/0x60
[ 0.901679] ? asm_exc_invalid_op+0x1a/0x20
[ 0.902081] ? __cfi_init_module+0x10/0x10 [crc32c_intel 5331566c5540f82df397056699bc4ddac8be1306]
[ 0.902933] ? __cfi_init_module+0x10/0x10 [crc32c_intel 5331566c5540f82df397056699bc4ddac8be1306]
[ 0.903781] ? __cfi_init_module+0x10/0x10 [crc32c_intel 5331566c5540f82df397056699bc4ddac8be1306]
[ 0.904634] ? do_one_initcall+0x128/0x380
[ 0.905028] ? idr_alloc_cyclic+0x139/0x1d0
[ 0.905437] ? security_kernfs_init_security+0x54/0x190
[ 0.905958] ? __kernfs_new_node+0x1ba/0x240
[ 0.906377] ? sysfs_create_dir_ns+0x8f/0x140
[ 0.906795] ? kernfs_link_sibling+0xf2/0x110
[ 0.907211] ? kernfs_activate+0x2c/0x110
[ 0.907599] ? kernfs_add_one+0x108/0x150
[ 0.907981] ? __kernfs_create_file+0x75/0xa0
[ 0.908407] ? sysfs_create_bin_file+0xc6/0x120
[ 0.908853] ? __vunmap_range_noflush+0x347/0x420
[ 0.909313] ? _raw_spin_unlock+0xe/0x30
[ 0.909692] ? free_unref_page+0x22c/0x4c0
[ 0.910097] ? __kmalloc_cache_noprof+0x1a8/0x360
[ 0.910546] do_init_module+0x60/0x250
[ 0.910910] __se_sys_finit_module+0x316/0x420
[ 0.911351] do_syscall_64+0x88/0x170
[ 0.911699] ? __x64_sys_lseek+0x68/0xb0
[ 0.912077] ? syscall_exit_to_user_mode+0x97/0xc0
[ 0.912538] ? do_syscall_64+0x94/0x170
[ 0.912902] ? syscall_exit_to_user_mode+0x97/0xc0
[ 0.913353] ? do_syscall_64+0x94/0x170
[ 0.913709] ? clear_bhb_loop+0x45/0xa0
[ 0.914071] ? clear_bhb_loop+0x45/0xa0
[ 0.914428] ? clear_bhb_loop+0x45/0xa0
[ 0.914767] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 0.915089] RIP: 0033:0x7ffb713dc1fd
[ 0.915316] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e3 fa 0c 00 f7 d8 64 89 01 48
[ 0.916491] RSP: 002b:00007ffffc4454a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 0.916964] RAX: ffffffffffffffda RBX: 000055f28c6a5420 RCX: 00007ffb713dc1fd
[ 0.917413] RDX: 0000000000000000 RSI: 000055f26c40cc03 RDI: 0000000000000003
[ 0.917858] RBP: 00007ffffc445560 R08: 0000000000000001 R09: 00007ffffc4454f0
[ 0.918302] R10: 0000000000000040 R11: 0000000000000246 R12: 000055f26c40cc03
[ 0.918748] R13: 0000000000060000 R14: 000055f28c6a4b50 R15: 000055f28c6ac5b0
[ 0.919211] </TASK>
[ 0.919356] Modules linked in: crc32c_intel(+)
[ 0.919661] ---[ end trace 0000000000000000 ]---
I also see some other WARNs interleaved, along the lines of:
[ 0.982759] no CFI hash found at: 0xffffffffc0608000 ffffffffc0608000 cc cc cc cc cc
[ 0.982767] WARNING: CPU: 5 PID: 170 at arch/x86/kernel/alternative.c:1204 __apply_fineibt+0xa6d/0xab0
The console appears to be a bit of a mess after that initial message.
If there is any more information I can provide or patches I can test, I
am more than happy to do so.
I got a similar report from the kbuild bot a few days ago:
https://lore.kernel.org/all/202410202257.b7edc376-lkp@xxxxxxxxx
I fixed the FineIBT handling in v7:
https://lore.kernel.org/linux-mm/20241023162711.2579610-1-rppt@xxxxxxxxxx
Cheers,
Nathan
--
Sincerely yours,
Mike.