On Tue, Mar 23, 2021, Ben Gardon wrote: > On Mon, Mar 22, 2021 at 5:15 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > > > > On Mon, Mar 22, 2021, Ben Gardon wrote: > > > It could be fixed by forbidding kvm_tdp_mmu_zap_gfn_range from > > > yielding. Since we should only need to zap one SPTE, the yield should > > > not be needed within the kvm_tdp_mmu_zap_gfn_range call. To ensure > > > that only one SPTE is zapped we would have to specify the root though. > > > Otherwise we could end up zapping all the entries for the same GFN > > > range under an unrelated root. > > > > Hmm, I originally did exactly that, but changed my mind because this zaps far > > more than 1 SPTE. This is zapping a SP that could be huge, but is not, which > > means it's guaranteed to have a non-zero number of child SPTEs. The worst case > > scenario is that SP is a PUD (potential 1gb page) and the leafs are 4k SPTEs. > > It's true that there are potentially 512^2 child sptes, but the code > to clear those after the single PUD spte is cleared doesn't yield > anyway. If the TDP MMU is only operating with one root (as we would > expect in most cases), there should only be one chance for it to > yield. Ah, right, I was thinking all the iterative flows yielded. Disallowing kvm_tdp_mmu_zap_gfn_range() from yielding in this case does seem like the best fix. Any objection to me sending v2 with that? > I've considered how we could allow the recursive changed spte handlers > to yield, but it gets complicated quite fast because the caller needs > to know if it yielded and reset the TDP iterator to the root, and > there are some cases (mmu notifiers + vCPU path) where yielding is not > desirable. Urgh, yeah, seems like we'd quickly end up with a mess resembling the legacy MMU iterators. > > > > But, I didn't consider the interplay between invalid_list and the TDP MMU > > yielding. Hrm.