On Wed, Sep 21, 2022, Vitaly Kuznetsov wrote:
> Sean Christopherson <seanjc@xxxxxxxxxx> writes:
>
> > On Wed, Sep 21, 2022, Sean Christopherson wrote:
> >> On Wed, Sep 21, 2022, Vitaly Kuznetsov wrote:
> >> > [ 962.257992] ept_fetch+0x504/0x5a0 [kvm]
> >> > [ 962.261959] ept_page_fault+0x2d7/0x300 [kvm]
> >> > [ 962.287701] kvm_mmu_page_fault+0x258/0x290 [kvm]
> >> > [ 962.292451] vmx_handle_exit+0xe/0x40 [kvm_intel]
> >> > [ 962.297173] vcpu_enter_guest+0x665/0xfc0 [kvm]
> >> > [ 962.307580] vcpu_run+0x33/0x250 [kvm]
> >> > [ 962.311367] kvm_arch_vcpu_ioctl_run+0xf7/0x460 [kvm]
> >> > [ 962.316456] kvm_vcpu_ioctl+0x271/0x670 [kvm]
> >> > [ 962.320843] __x64_sys_ioctl+0x87/0xc0
> >> > [ 962.324602] do_syscall_64+0x38/0x90
> >> > [ 962.328192] entry_SYSCALL_64_after_hwframe+0x63/0xcd
> >>
> >> Ugh, past me completely forgot the basics of shadow paging[*]. The shadow MMU
> >> can reuse existing shadow pages, whereas the TDP MMU always links in new pages.
> >>
> >> I got turned around by the "doesn't exist" check, which only means "is there
> >> already a _SPTE_ here", not "is there an existing SP for the target gfn+role that
> >> can be used".
> >>
> >> I'll drop the series from the queue, send a new pull request, and spin a v5
> >> targeting 6.2, which amusing will look a lot like v1...
> >
> > Huh. I was expecting more churn, but dropping the offending patch and then
> > "reworking" the series yields a very trivial overall diff.
> >
> > Vitaly, can you easily re-test with the below, i.e. simply delete the
> > KVM_BUG_ON()?
>
> This seems to work! At least, I haven't noticed anything weird when
> booting my beloved Win11 + WSL2 guest.

I finally figured out why I didn't see this in testing. It _should_ have fired
during kernel boot when testing legacy shadow paging, i.e. ept=0, as the bug
requires nothing more than executing from two GVAs pointing at the same huge
2mb GPA.
I did test ept=0, but none of my normal test systems are susceptible to L1TF
(KVM guest, all AMD, and ICX), i.e. they don't enable the mitigation by default.
I also tested those systems with the mitigation forced on and ept=0, but never
booted a VM with that combination, and neither KUT nor selftests does the
requisite aliasing with huge pages. Death was instantaneous once I forced the
mitigation on with ept=0 and booted a VM. *sigh*
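
For anyone following along, here is a rough toy model of the distinction that tripped me up: "there is no SPTE at this slot" does not imply "there is no shadow page for this gfn+role". This is only an illustrative sketch, not KVM code. Every name in it (ToyShadowMMU, ToyShadowPage, fault, the "4k-table" role string, and the addresses) is made up for the example, and it ignores everything about real page-table walks.

```python
from dataclasses import dataclass


@dataclass
class ToyShadowPage:
    """Hypothetical stand-in for a shadow page, keyed by gfn + role."""
    gfn: int
    role: str


class ToyShadowMMU:
    """Toy model of a shadow MMU that reuses shadow pages (not KVM's API)."""

    def __init__(self):
        self.pages = {}  # (gfn, role) -> ToyShadowPage
        self.sptes = {}  # gva -> ToyShadowPage; a missing key is a non-present SPTE

    def fault(self, gva, gfn, role):
        """Handle a fault; return (sp, spte_was_present, sp_was_reused)."""
        if gva in self.sptes:
            # An SPTE already exists at this slot, nothing to link.
            return self.sptes[gva], True, False

        # No SPTE here, but a shadow page for the same gfn+role may already
        # exist because some other GVA linked it earlier.
        key = (gfn, role)
        sp = self.pages.get(key)
        reused = sp is not None
        if sp is None:
            sp = ToyShadowPage(gfn, role)
            self.pages[key] = sp

        self.sptes[gva] = sp
        return sp, False, reused


# Two different GVAs that alias the same guest frame.
mmu = ToyShadowMMU()
first = mmu.fault(gva=0x1000_0000, gfn=0x200, role="4k-table")
second = mmu.fault(gva=0x2000_0000, gfn=0x200, role="4k-table")
```

In this model the second fault sees a non-present SPTE yet gets back the same shadow page as the first, so any "this page is brand new" assertion keyed off the SPTE check alone would fire on the second GVA. A model of the TDP MMU behaviour described above would instead allocate a fresh page on every such fault.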