On Thu, Mar 25, 2021 at 3:25 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Thu, Mar 25, 2021, Ben Gardon wrote:
> > On Thu, Mar 25, 2021 at 1:01 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > > +static inline bool kvm_tdp_mmu_zap_gfn_range(struct kvm *kvm, gfn_t start,
> > > +                                             gfn_t end)
> > > +{
> > > +        return __kvm_tdp_mmu_zap_gfn_range(kvm, start, end, true);
> > > +}
> > > +static inline bool kvm_tdp_mmu_zap_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
> >
> > I'm a little leery of adding an interface which takes a non-root
> > struct kvm_mmu_page as an argument to the TDP MMU.
> > In the TDP MMU, the struct kvm_mmu_pages are protected rather subtly.
> > I agree this is safe because we hold the MMU lock in write mode here,
> > but if we ever wanted to convert to holding it in read mode things
> > could get complicated fast.
> > Maybe this is more of a concern if the function started to be used
> > elsewhere since NX recovery is already so dependent on the write lock.
>
> Agreed.  Even writing the comment below felt a bit awkward when thinking about
> additional users holding mmu_lock for read.  Actually, I should remove that
> specific blurb since zapping currently requires holding mmu_lock for write.
>
> > Ideally though, NX reclaim could use MMU read lock +
> > tdp_mmu_pages_lock to protect the list and do reclaim in parallel with
> > everything else.
>
> Yar, processing all legacy MMU pages, and then all TDP MMU pages to avoid some
> of these dependencies crossed my mind.  But, it's hard to justify effectively
> walking the list twice.  And maintaining two lists might lead to balancing
> issues, e.g. the legacy MMU and thus nested VMs get zapped more often than the
> TDP MMU, or vice versa.

I think in an earlier version of the TDP MMU that I sent out, NX reclaim was a
separate thread for each of the two MMUs, sidestepping the balance issue. I
think the TDP MMU also had a separate NX reclaim list. That would also make it
easier to do something under the read lock.

>
> > The nice thing about drawing the TDP MMU interface in terms of GFNs
> > and address space IDs instead of SPs is that it doesn't put
> > constraints on the implementation of the TDP MMU because those GFNs
> > are always going to be valid / don't require any shared memory.
> > This is kind of innocuous because it's immediately converted into that
> > gfn interface, so I don't know how much it really matters.
> >
> > In any case this change looks correct and I don't want to hold up
> > progress with bikeshedding.
> > WDYT?
>
> I think we're kind of hosed either way.  Either we add a helper in the TDP MMU
> that takes an SP, or we bleed a lot of information about the details of the
> TDP MMU into the common MMU.  E.g. the function could be open-coded verbatim,
> but the whole comment below, and the motivation for not feeding in flush, is
> very dependent on the internal details of the TDP MMU.
>
> I don't have a super strong preference.  One thought would be to assert that
> mmu_lock is held for write, and then it largely becomes a future person's
> problem :-)

Yeah, I agree and I'm happy to kick this proverbial can down the road until we
actually add an NX reclaim implementation that uses the MMU read lock. A rough
sketch of the assertion I have in mind is at the bottom of this mail.

>
> > > +{
> > > +        gfn_t end = sp->gfn + KVM_PAGES_PER_HPAGE(sp->role.level);
> > > +
> > > +        /*
> > > +         * Don't allow yielding, as the caller may have a flush pending.  Note,
> > > +         * if mmu_lock is held for write, zapping will never yield in this case,
> > > +         * but explicitly disallow it for safety.
> > > +         * The TDP MMU does not yield until it has made forward progress
> > > +         * (steps sideways), and when zapping a single shadow page that it's
> > > +         * guaranteed to see (thus the mmu_lock requirement), its "step
> > > +         * sideways" will always step beyond the bounds of the shadow page's
> > > +         * gfn range and stop iterating before yielding.
> > > +         */
> > > +        return __kvm_tdp_mmu_zap_gfn_range(kvm, sp->gfn, end, false);
> > > +}
> > >  void kvm_tdp_mmu_zap_all(struct kvm *kvm);
> > >
> > >  int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
> > > --
> > > 2.31.0.291.g576ba9dcdaf-goog
> > >
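
For the assertion idea, something along these lines is what I'm picturing
(untested sketch, just to illustrate the suggestion; assumes mmu_lock is the
x86 rwlock_t flavor so lockdep_assert_held_write() applies):

static inline bool kvm_tdp_mmu_zap_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
{
        gfn_t end = sp->gfn + KVM_PAGES_PER_HPAGE(sp->role.level);

        /*
         * Zapping a specific shadow page relies on mmu_lock being held for
         * write; make that dependency explicit so a future conversion to the
         * read lock trips over it immediately.
         */
        lockdep_assert_held_write(&kvm->mmu_lock);

        /* Don't allow yielding, as the caller may have a flush pending. */
        return __kvm_tdp_mmu_zap_gfn_range(kvm, sp->gfn, end, false);
}

That keeps the write-lock requirement documented in code rather than only in
the comment, and it costs nothing when lockdep is disabled.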