On Thu, Jan 13, 2022, David Matlack wrote: > On Thu, Jan 13, 2022 at 9:04 AM David Matlack <dmatlack@xxxxxxxxxx> wrote: > > > > On Wed, Jan 12, 2022 at 4:29 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > > > > Oh interesting. I actually find that confusing because it can easily > > > > lead to the MMU-writable bit staying set. Here we are protecting GFNs > > > > and we're opting to leave the MMU-writable bit set. It takes a lot of > > > > digging to figure out that this is safe because if MMU-writable is set > > > > and the SPTE cannot be locklessly be made writable then it implies > > > > Host-writable is clear, and Host-writable can't be reset without > > > > syncing the all shadow pages reachable by the MMU. Oh and the > > > > MMU-writable bit is never consulted on its own (e.g. We never iterate > > > > through all SPTEs to find the ones that are !MMU-writable). > > > > > > Ah, you've missed the other wrinkle: MMU-writable can bet set iff Host-writable > > > is set. In other words, the MMU-writable bit is never left set because it can't > > > be set if spte_can_locklessly_be_made_writable() returns false. > > The changed_pte notifier looks like it clears Host-writable without > clearing MMU-writable. Specifically the call chain: > > kvm_mmu_notifier_change_pte() > kvm_set_spte_gfn() > kvm_tdp_mmu_set_spte_gfn() > set_spte_gfn() > kvm_mmu_changed_pte_notifier_make_spte() > > Is there some guarantee that old_spte is !MMU-writable at this point? Ugh, I misread that code, multiple times. There's no guarantee, it was likely just missed when MMU-writable was introduced. Note, you literally cannot hit that code path in current kernels. See commit c13fda237f08 ("KVM: Assert that notifier count is elevated in .change_pte()"). So whatever you do is effectively untestable. I really want to rip out .change_pte(), but I also don't want to do any performance testing to justify removing the code instead of fixing it proper, so it's hung around as a zombie... > If not I could easily change kvm_mmu_changed_pte_notifier_make_spte() > to also clear MMU-writable and preserve the invariant. Yes, please.