On Mon, May 17, 2021 at 7:13 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > > On Fri, May 14, 2021, Greg KH wrote: > > On Fri, May 14, 2021 at 01:38:53PM +0200, Jack Wang wrote: > > > From: Sean Christopherson <seanjc@xxxxxxxxxx> > > > > > > commit c5e2184d1544f9e56140791eff1a351bea2e63b9 upstream > > > > > > Remove the update_pte() shadow paging logic, which was obsoleted by > > > commit 4731d4c7a077 ("KVM: MMU: out of sync shadow core"), but never > > > removed. As pointed out by Yu, KVM never write protects leaf page > > > tables for the purposes of shadow paging, and instead marks their > > > associated shadow page as unsync so that the guest can write PTEs at > > > will. > > > > > > The update_pte() path, which predates the unsync logic, optimizes COW > > > scenarios by refreshing leaf SPTEs when they are written, as opposed to > > > zapping the SPTE, restarting the guest, and installing the new SPTE on > > > the subsequent fault. Since KVM no longer write-protects leaf page > > > tables, update_pte() is unreachable and can be dropped. > > > > > > Reported-by: Yu Zhang <yu.c.zhang@xxxxxxxxx> > > > Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx> > > > Message-Id: <20210115004051.4099250-1-seanjc@xxxxxxxxxx> > > > Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx> > > > (jwang: backport to 5.4 to fix a warning on AMD nested Virtualization) > > > Signed-off-by: Jack Wang <jinpu.wang@xxxxxxxxx> > > > --- > > > We hit a warning in WARNING: CPU: 62 PID: 29302 at arch/x86/kvm/mmu.c:2250 nonpaging_update_pte+0x5/0x10 [kvm] > > > on AMD Opteron(tm) Processor 6386 SE with kernel 5.4.113, it seems nested L2 is running, I notice a similar bug > > > report on https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1884058. > > > > > > I did test with kvm-unit-tests on both Intel Broadwell/Skylake, AMD Opteron, no > > > regression, basic VM tests work fine too on 5.4 kernel. > > > the commit c5e2184d1544f9e56140791eff1a351bea2e63b9 can be cherry-picked cleanly > > > to kernel 5.10+. > > Ah, drat. The WARN fires because update_pte() comes from the current MMU context, > and older kernels are missing a check to ensure the current MMU context is > "compatible" with the target shadow page's context. > > That bug was inadvertantly fixed by commit a102a674e423 ("KVM: x86/mmu: Don't drop > level/direct from MMU role calculation"), but I didn't tag that for stable because > I incorrectly thought that adding a check on the "direct" role was a nop. From > that changelog: > > While doing so, stop ignoring "direct". The field is effectively ignored > anyways because kvm_mmu_pte_write() is only reached with an indirect mmu > and the loop only walks indirect shadow pages, but double checking "direct" > literally costs nothing. > > Specifically, the "kvm_mmu_pte_write() is only reached with an indirect mmu" part > fails to account for the case where where the VM has one or more indirect MMUs > _and_ a direct MMU (the current MMU context). > > All that said, ripping out this code works too, and is preferable since the code > is dead for its intended purpose, i.e. reaching the update_pte() code can only > be a random, unwanted collision. > > > Now queued up, thanks. > > Belated thumbs up, thanks! Hi Sean! Thanks for checking and detailed implaination. Regards! Jack Wang