On Fri, Dec 10, 2021, Sean Christopherson wrote: > On Fri, Dec 10, 2021, Paolo Bonzini wrote: > > On 12/10/21 17:01, Sean Christopherson wrote: > > > > KVM_REQ_MMU_RELOAD is raised after kvm->arch.mmu_valid_gen is fixed (of > > > > course, otherwise the other CPU might just not see any obsoleted page > > > > from the legacy MMU), therefore any check on KVM_REQ_MMU_RELOAD is just > > > > advisory. > > > > > > I disagree. IMO, KVM should not be installing SPTEs into obsolete shadow pages, > > > which is what continuing on allows. I don't _think_ it's problematic, but I do > > > think it's wrong. > > > > > > [...] Eh, for all intents and purposes, KVM_REQ_MMU_RELOAD very much says > > > special roots are obsolete. The root will be unloaded, i.e. will no > > > longer be used, i.e. is obsolete. > > > > I understand that---but it takes some unspoken details to understand that. > > Eh, it takes just as many unspoken details to understand why it's safe-ish to > install SPTEs into an obsolete shadow page. > > > In particular that both kvm_reload_remote_mmus and is_page_fault_stale are > > called under mmu_lock write-lock, and that there's no unlock between > > updating mmu_valid_gen and calling kvm_reload_remote_mmus. > > > > (This also suggests, for the other six patches, keeping > > kvm_reload_remote_mmus and just moving it to arch/x86/kvm/mmu/mmu.c, with an > > assertion that the MMU lock is held for write). > > > > But since we have a way forward for having no special roots to worry about, > > it seems an unnecessary overload for 1) a patch that will last one or two > > releasees at most > > Yeah, I don't disagree, which is why I'm not totally opposed to punting this and > naturally fixing it by allocating shadow pages for the special roots. But this > code needs to be modified by Jiangshan's series either way, so it's not like we're > saving anything meaningful. > > > 2) a case that has been handled in the inefficient way forever. > > I don't care about inefficiency, I'm worried about correctness. It's extremely > unlikely this fixes a true bug in the legacy MMU, but there's also no real > downside to adding the check. > > Anyways, either way is fine. Ping, in case this dropped off your radar. Regardless of how we fix this goof, it needs to get fixed in 5.16.