On Tue, Mar 26, 2024 at 08:15:12PM +0100, Mirsad Todorovac wrote:
> On 3/26/24 11:16, Borislav Petkov wrote:
> > On Wed, Mar 20, 2024 at 02:28:57AM +0100, Mirsad Todorovac wrote:
> > > Please find the kernel .config attached.
> >
> > Thanks, that's one huuuge kernel you're building. :)
> >
> > > I got another one of these "Unpatched thunk" and it seems connected
> > > with selftest/kvm.
> > >
> > > But running selftests/kvm one by one did not trigger the bug.
> >
> > Which commands are you exactly running?
> >
> > I'll try to reproduce here.
>
> I think I have a reproducer here on the latest torvalds vanilla tree (on Ubuntu 22.04 LTS box):
>
> root# tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> Running test with CAP_SYS_BOOT enabled
> Running as root, skipping nx_huge_pages_test with CAP_SYS_BOOT disabled
> root# git describe
> v6.9-rc1-5-g928a87efa423
> root#

I'm seeing it pretty consistently on kvm/next as well. Not sure if
there's anything special about my config but starting a fairly basic
SVM guest seems to be enough to trigger it for me on the first
invocation of svm_vcpu_run().

It seems to be 2 call-sites, one inside:

  amd_clear_divider()

and another inside:

  __svm_vcpu_run()

which seems to match up with the decoded stack you posted here. Maybe
the first case would be easiest to focus on? It's a fairly
straight-forward use of ALTERNATIVE():

  void noinstr amd_clear_divider(void)
  {
  	asm volatile(ALTERNATIVE("", "div %2\n\t", X86_BUG_DIV0)
  		     :: "a" (0), "d" (0), "r" (1));
  }
  EXPORT_SYMBOL_GPL(amd_clear_divider);

and it's been that way since before 4461438a84 ("x86/retpoline: Ensure
default return thunk isn't used at runtime") was added. Not sure if
anything else has changed underneath the covers since 4461438a84.

-Mike

>
> > Thx.
>
> Not at all.
>
> The stacktrace for the bug triggered by the above command was:
>
> kernel: [ 101.973612] ------------[ cut here ]------------
> kernel: [ 101.973615] Unpatched return thunk in use. This should not happen!
> kernel: [ 101.973618] WARNING: CPU: 1 PID: 3827 at arch/x86/kernel/cpu/bugs.c:2935 __warn_thunk (./arch/x86/kernel/cpu/bugs.c:2935 (discriminator 3))
> kernel: [ 101.973625] Modules linked in: xfrm_user nf_tables nfnetlink nvme_fabrics binfmt_misc snd_hda_codec_realtek snd_hda_codec_generic snd_hda_scodec_component snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec intel_rapl_msr amd_atl snd_hda_core intel_rapl_common nls_iso8859_1 snd_hwdep snd_pcm edac_mce_amd amdgpu crct10dif_pclmul polyval_clmulni snd_seq_midi polyval_generic snd_seq_midi_event ghash_clmulni_intel sha512_ssse3 snd_rawmidi sha256_ssse3 amdxcp sha1_ssse3 drm_exec aesni_intel snd_seq gpu_sched crypto_simd drm_buddy cryptd drm_suballoc_helper drm_ttm_helper snd_seq_device joydev input_leds rapl ttm snd_timer wmi_bmof drm_display_helper cec snd drm_kms_helper k10temp ccp i2c_algo_bit soundcore mac_hid tcp_bbr msr parport_pc ppdev lp parport drm efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq hid_generic nvme r8169 xhci_pci ahci nvme_core crc32_pclmul i2c_piix4 xhci_pci_renesas nvme_auth realtek libahci video wmi gpio_amdpt
> kernel: [ 101.973685] CPU: 1 PID: 3827 Comm: nx_huge_pages_t Not tainted 6.9.0-rc1-torv-00005-g928a87efa423-dirty #36
> kernel: [ 101.973687] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
> kernel: [ 101.973688] RIP: 0010:__warn_thunk (./arch/x86/kernel/cpu/bugs.c:2935 (discriminator 3))
> kernel: [ 101.973691] Code: 62 c5 1d 01 83 e3 01 74 0e 48 8b 5d f8 c9 31 f6 31 ff e9 be 98 3b 01 48 c7 c7 98 21 c1 bc c6 05 22 26 8d 02 01 e8 90 aa 07 00 <0f> 0b 48 8b 5d f8 c9 31 f6 31 ff e9 9b 98 3b 01 90 90 90 90 90 90
> All code
> ========
>    0:   62 c5 1d 01 83          (bad)
>    5:   e3 01                   jrcxz  0x8
>    7:   74 0e                   je     0x17
>    9:   48 8b 5d f8             mov    -0x8(%rbp),%rbx
>    d:   c9                      leave
>    e:   31 f6                   xor    %esi,%esi
>   10:   31 ff                   xor    %edi,%edi
>   12:   e9 be 98 3b 01          jmp    0x13b98d5
>   17:   48 c7 c7 98 21 c1 bc    mov    $0xffffffffbcc12198,%rdi
>   1e:   c6 05 22 26 8d 02 01    movb   $0x1,0x28d2622(%rip)        # 0x28d2647
>   25:   e8 90 aa 07 00          call   0x7aaba
>   2a:*  0f 0b                   ud2             <-- trapping instruction
>   2c:   48 8b 5d f8             mov    -0x8(%rbp),%rbx
>   30:   c9                      leave
>   31:   31 f6                   xor    %esi,%esi
>   33:   31 ff                   xor    %edi,%edi
>   35:   e9 9b 98 3b 01          jmp    0x13b98d5
>   3a:   90                      nop
>   3b:   90                      nop
>   3c:   90                      nop
>   3d:   90                      nop
>   3e:   90                      nop
>   3f:   90                      nop
>
> Code starting with the faulting instruction
> ===========================================
>    0:   0f 0b                   ud2
>    2:   48 8b 5d f8             mov    -0x8(%rbp),%rbx
>    6:   c9                      leave
>    7:   31 f6                   xor    %esi,%esi
>    9:   31 ff                   xor    %edi,%edi
>    b:   e9 9b 98 3b 01          jmp    0x13b98ab
>   10:   90                      nop
>   11:   90                      nop
>   12:   90                      nop
>   13:   90                      nop
>   14:   90                      nop
>   15:   90                      nop
> kernel: [ 101.973692] RSP: 0018:ffffbbd90580fc90 EFLAGS: 00010046
> kernel: [ 101.973694] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> kernel: [ 101.973695] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> kernel: [ 101.973696] RBP: ffffbbd90580fc98 R08: 0000000000000000 R09: 0000000000000000
> kernel: [ 101.973697] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9964e4b7d4f0
> kernel: [ 101.973698] R13: 0000000000000000 R14: 0000000000000000 R15: ffff9964e4b7dc70
> kernel: [ 101.973699] FS: 0000720b95372740(0000) GS:ffff9973d7a80000(0000) knlGS:0000000000000000
> kernel: [ 101.973700] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> kernel: [ 101.973701] CR2: 0000000000000000 CR3: 00000001aea6c000 CR4: 0000000000f50ef0
> kernel: [ 101.973703] PKRU: 55555554
> kernel: [ 101.973703] Call Trace:
> kernel: [ 101.973704] <TASK>
> kernel: [ 101.973706] ? show_regs (./arch/x86/kernel/dumpstack.c:479)
> kernel: [ 101.973709] ? __warn_thunk (./arch/x86/kernel/cpu/bugs.c:2935 (discriminator 3))
> kernel: [ 101.973711] ? __warn (./kernel/panic.c:694)
> kernel: [ 101.973713] ? __warn_thunk (./arch/x86/kernel/cpu/bugs.c:2935 (discriminator 3))
> kernel: [ 101.973715] ? report_bug (./lib/bug.c:201 ./lib/bug.c:219)
> kernel: [ 101.973718] ? irq_work_queue (./kernel/irq_work.c:119)
> kernel: [ 101.973722] ? handle_bug (./arch/x86/kernel/traps.c:218)
> kernel: [ 101.973725] ? exc_invalid_op (./arch/x86/kernel/traps.c:260 (discriminator 1))
> kernel: [ 101.973727] ? asm_exc_invalid_op (././arch/x86/include/asm/idtentry.h:621)
> kernel: [ 101.973731] ? __warn_thunk (./arch/x86/kernel/cpu/bugs.c:2935 (discriminator 3))
> kernel: [ 101.973734] warn_thunk_thunk (./arch/x86/entry/entry.S:48)
> kernel: [ 101.973738] svm_vcpu_enter_exit (././include/linux/kvm_host.h:547 ./arch/x86/kvm/svm/svm.c:4115)
> kernel: [ 101.973740] svm_vcpu_run (././arch/x86/include/asm/cpufeature.h:171 ./arch/x86/kvm/svm/svm.c:4186)
> kernel: [ 101.973744] kvm_arch_vcpu_ioctl_run (./arch/x86/kvm/x86.c:11008 ./arch/x86/kvm/x86.c:11211 ./arch/x86/kvm/x86.c:11437)
> kernel: [ 101.973747] ? srso_alias_return_thunk (./arch/x86/lib/retpoline.S:181)
> kernel: [ 101.973750] ? srso_alias_return_thunk (./arch/x86/lib/retpoline.S:181)
> kernel: [ 101.973752] ? kvm_vm_stats_read (./arch/x86/kvm/../../../virt/kvm/kvm_main.c:5066)
> kernel: [ 101.973755] kvm_vcpu_ioctl (./arch/x86/kvm/../../../virt/kvm/kvm_main.c:4464)
> kernel: [ 101.973757] ? srso_alias_return_thunk (./arch/x86/lib/retpoline.S:181)
> kernel: [ 101.973759] ? trace_hardirqs_on_prepare (./kernel/trace/trace_preemptirq.c:47 ./kernel/trace/trace_preemptirq.c:42)
> kernel: [ 101.973761] ? srso_alias_return_thunk (./arch/x86/lib/retpoline.S:181)
> kernel: [ 101.973763] ? syscall_exit_to_user_mode (./kernel/entry/common.c:221)
> kernel: [ 101.973765] ? srso_alias_return_thunk (./arch/x86/lib/retpoline.S:181)
> kernel: [ 101.973767] ? do_syscall_64 (././arch/x86/include/asm/cpufeature.h:171 ./arch/x86/entry/common.c:98)
> kernel: [ 101.973770] __x64_sys_ioctl (./fs/ioctl.c:51 ./fs/ioctl.c:904 ./fs/ioctl.c:890 ./fs/ioctl.c:890)
> kernel: [ 101.973773] do_syscall_64 (./arch/x86/entry/common.c:52 ./arch/x86/entry/common.c:83)
> kernel: [ 101.973775] ? srso_alias_return_thunk (./arch/x86/lib/retpoline.S:181)
> kernel: [ 101.973777] ? irqentry_exit (./kernel/entry/common.c:367)
> kernel: [ 101.973778] ? srso_alias_return_thunk (./arch/x86/lib/retpoline.S:181)
> kernel: [ 101.973780] entry_SYSCALL_64_after_hwframe (./arch/x86/entry/entry_64.S:129)
> kernel: [ 101.973782] RIP: 0033:0x720b9511a94f
> kernel: [ 101.973798] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00
> All code
> ========
>    0:   00 48 89                add    %cl,-0x77(%rax)
>    3:   44 24 18                rex.R and $0x18,%al
>    6:   31 c0                   xor    %eax,%eax
>    8:   48 8d 44 24 60          lea    0x60(%rsp),%rax
>    d:   c7 04 24 10 00 00 00    movl   $0x10,(%rsp)
>   14:   48 89 44 24 08          mov    %rax,0x8(%rsp)
>   19:   48 8d 44 24 20          lea    0x20(%rsp),%rax
>   1e:   48 89 44 24 10          mov    %rax,0x10(%rsp)
>   23:   b8 10 00 00 00          mov    $0x10,%eax
>   28:   0f 05                   syscall
>   2a:*  41 89 c0                mov    %eax,%r8d                <-- trapping instruction
>   2d:   3d 00 f0 ff ff          cmp    $0xfffff000,%eax
>   32:   77 1f                   ja     0x53
>   34:   48 8b 44 24 18          mov    0x18(%rsp),%rax
>   39:   64                      fs
>   3a:   48                      rex.W
>   3b:   2b                      .byte 0x2b
>   3c:   04 25                   add    $0x25,%al
>   3e:   28 00                   sub    %al,(%rax)
>
> Code starting with the faulting instruction
> ===========================================
>    0:   41 89 c0                mov    %eax,%r8d
>    3:   3d 00 f0 ff ff          cmp    $0xfffff000,%eax
>    8:   77 1f                   ja     0x29
>    a:   48 8b 44 24 18          mov    0x18(%rsp),%rax
>    f:   64                      fs
>   10:   48                      rex.W
>   11:   2b                      .byte 0x2b
>   12:   04 25                   add    $0x25,%al
>   14:   28 00                   sub    %al,(%rax)
> kernel: [ 101.973799] RSP: 002b:00007ffd786b9ca0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> kernel: [ 101.973801] RAX: ffffffffffffffda RBX: 0000000000600000 RCX: 0000720b9511a94f
> kernel: [ 101.973802] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000005
> kernel: [ 101.973803] RBP: 0000720b953726c0 R08: 000000000041b228 R09: 0000000000000000
> kernel: [ 101.973804] R10: 0000720b951d8882 R11: 0000000000000246 R12: 000000000c9b18c0
> kernel: [ 101.973805] R13: 000000000c9b18c0 R14: 0000000000000000 R15: 0000000000000064
> kernel: [ 101.973809] </TASK>
> kernel: [ 101.973810] ---[ end trace 0000000000000000 ]---
>
> NOTE: Cc:-ed author of the reproducer for these results.
> NOTE 2: The stacktrace is only displayed once, repeating the reproducer doesn't work until the next reboot.
>
> Sending the latest config as well attached:
>
> Best regards,
> Mirsad Todorovac
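
For anyone trying to confirm the reproducer quoted above, a minimal sketch of a log check follows. It only looks for the warning text from the quoted trace in whatever log text it is given on stdin; the function name has_thunk_warning is made up for illustration and is not part of the selftests or the kernel tree.

```shell
#!/bin/sh
# Sketch only: detect the "Unpatched return thunk" warning in kernel log text.
# has_thunk_warning is a hypothetical helper name, not an existing tool.

# Warning string as it appears in the quoted trace.
THUNK_WARNING='Unpatched return thunk in use'

# Reads log text on stdin; exits 0 if the warning is present, non-zero otherwise.
has_thunk_warning() {
    grep -q -F "$THUNK_WARNING"
}
```

One possible use, as root after a fresh boot and from the top of the kernel tree: run tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh, then "dmesg | has_thunk_warning && echo reproduced". Since the warning is only printed once per boot (see NOTE 2 in the quoted mail), a miss on a second run without rebooting says nothing either way.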