On Fri, Jul 19, 2024, Sean Christopherson wrote:
> Inject a #GP on a WRMSR(ICR) that attempts to set any reserved bits that
> are must-be-zero on both Intel and AMD, i.e. any reserved bits other than
> the BUSY bit, which Intel ignores and basically says is undefined.
>
> KVM's xapic_state_test selftest has been fudging the bug since commit
> 4b88b1a518b3 ("KVM: selftests: Enhance handling WRMSR ICR register in
> x2APIC mode"), which essentially removed the testcase instead of fixing
> the bug.
>
> WARN if the nodecode path triggers a #GP, as the CPU is supposed to check
> reserved bits for ICR when it's partially virtualized.

Apparently this isn't accurate, as I've now hit the WARN twice with x2AVIC.  I
haven't debugged in depth, but it's either INVALID_TARGET or INVALID_INT_TYPE.
Which is odd, because the WARN only happens rarely, i.e. it appears to be a
race of some form.  But I wouldn't expect those checks to be subject to races.

Ah, but maybe this one is referring to the VALID bit?

  address is not present in the physical or logical ID tables

If that's the case, then (a) ucode is buggy (IMO) and is doing table lookups
*before* reserved bits checks, and (b) I don't see a better option than simply
deleting the WARN.

  ------------[ cut here ]------------
  WARNING: CPU: 146 PID: 274555 at arch/x86/kvm/lapic.c:2521 kvm_apic_write_nodecode+0x7a/0x90 [kvm]
  Modules linked in: kvm_amd kvm ... [last unloaded: kvm]
  CPU: 146 UID: 0 PID: 274555 Comm: qemu Not tainted 6.12.0-smp--41585e8a34cb-sink #458
  Hardware name: Google Astoria/astoria, BIOS 0.20240617.0-0 06/17/2024
  RIP: 0010:kvm_apic_write_nodecode+0x7a/0x90 [kvm]
  RSP: 0018:ff51c04b4d133be8 EFLAGS: 00010202
  RAX: 0000000000000001 RBX: 0000000000000000 RCX: 00000000000cffff
  RDX: 0000000087fd0e00 RSI: 00000000000cffff RDI: ff42132c9e336f00
  RBP: ff51c04b4d133e50 R08: 0000000000000000 R09: 0000000000060000
  R10: ffffffffc067428f R11: ffffffffc080aa20 R12: 00000000000cffff
  R13: 0000000000000000 R14: ff42132d09e7c2c0 R15: 0000000000000000
  FS:  00007fc1af0006c0(0000) GS:ff42138a08500000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 0000006267e52001 CR4: 0000000000771ef0
  PKRU: 00000000
  Call Trace:
   <TASK>
   avic_incomplete_ipi_interception+0x24a/0x4c0 [kvm_amd]
   kvm_arch_vcpu_ioctl_run+0x1e11/0x2720 [kvm]
   kvm_vcpu_ioctl+0x54f/0x630 [kvm]
   __se_sys_ioctl+0x6b/0xc0
   do_syscall_64+0x83/0x160
   entry_SYSCALL_64_after_hwframe+0x76/0x7e
  RIP: 0033:0x7fc1b584624b
   </TASK>
  ---[ end trace 0000000000000000 ]---
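
For reference, the reserved-bits check described in the quoted changelog can be
sketched in standalone C roughly as below.  This is only an illustration, not
the code from the patch: the macro and function names are hypothetical, and the
mask assumes the usual x2APIC ICR layout, where bits 31:20, 17:16 and 13 are
reserved and bit 12 is the BUSY bit that is deliberately left out of the mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers, loosely modeled on the kernel's BIT()/GENMASK_ULL(). */
#define ICR_BIT(n)        (1ULL << (n))
#define ICR_MASK(hi, lo)  ((~0ULL >> (63 - (hi))) & ~((1ULL << (lo)) - 1))

/* Assumption: bits that are must-be-zero on both Intel and AMD. */
#define ICR_RESERVED_BITS (ICR_MASK(31, 20) | ICR_MASK(17, 16) | ICR_BIT(13))

/* Bit 12 (BUSY) is intentionally NOT part of the reserved mask. */
#define ICR_BUSY          ICR_BIT(12)

/* Compile-time sanity check on the mask arithmetic. */
static_assert(ICR_RESERVED_BITS == 0xfff32000ULL, "unexpected reserved mask");

/* Returns true if a 64-bit WRMSR(ICR) value should be rejected with a #GP. */
static bool icr_write_should_gp(uint64_t data)
{
	return (data & ICR_RESERVED_BITS) != 0;
}
```

With this mask, a write that sets only the vector, or the vector plus BUSY, is
accepted, while a write that sets bit 13 or any of bits 31:20 is rejected.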