On Tue, Oct 12, 2021, Sean Christopherson wrote:
> If we are unable to root cause and fix the bug, I think a viable workaround would
> be to clear the hardware present bit in unrelated SPTEs, but keep the SPTEs
> themselves.  The idea is mostly the same as the ZAPPED_PRIVATE concept from the
> initial TDX RFC.  MMU notifier invalidations, memslot removal, RMP restoration,
> etc... would all continue to work since the SPTEs are still there, and KVM's page
> fault handler could audit any "blocked" SPTE when it's refaulted (I'm pretty sure
> it'd be impossible for the PFN to change, since any PFN change would require a
> memslot update or mmu_notifier invalidation).
>
> The downside to that approach is that it would require walking all SPTEs to do a
> memslot deletion, i.e. we'd lose the "fast zap" behavior.  If that's a performance
> issue, the behavior could be opt-in (but not for SNP/TDX).

Another option, if we introduce private memslots, is to preserve private memslots
on unrelated deletions.  The argument is that (a) private memslots are a new
feature so there's no prior uABI to break, and (b) if not zapping private memslot
SPTEs in response to the guest remapping a BAR somehow breaks GPU pass-through,
then the bug is all but guaranteed to be somewhere besides KVM's memslot logic.
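
To make the two options concrete, here's a rough userspace model of what I have
in mind.  This is a sketch only, not KVM code: the SPTE bit layout, the
SPTE_BLOCKED software bit, the is_private flag, the policy enum and every
function name below are made up for illustration.  It only shows the intended
semantics: "block" clears the present bit but keeps the PFN, a refault audits
the PFN before restoring the mapping, and "preserve private" leaves private
memslot SPTEs alone when an unrelated memslot is deleted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout for this model only; not KVM's real SPTE format. */
#define SPTE_PRESENT   (1ULL << 0)   /* hardware present bit */
#define SPTE_BLOCKED   (1ULL << 62)  /* assumed software-only "blocked" marker */
#define SPTE_PFN_SHIFT 12
#define SPTE_PFN_MASK  0x000ffffffffff000ULL

enum del_policy {
	POLICY_BLOCK_UNRELATED,   /* option 1: block every unrelated SPTE */
	POLICY_PRESERVE_PRIVATE,  /* option 2: zap all, except private slots */
};

struct memslot {
	bool is_private;   /* hypothetical private memslot flag */
	uint64_t *sptes;
	size_t nr_sptes;
};

static uint64_t make_spte(uint64_t pfn)
{
	return ((pfn << SPTE_PFN_SHIFT) & SPTE_PFN_MASK) | SPTE_PRESENT;
}

static uint64_t spte_to_pfn(uint64_t spte)
{
	return (spte & SPTE_PFN_MASK) >> SPTE_PFN_SHIFT;
}

/* Clear the present bit but keep the SPTE, and thus its PFN, in place. */
static uint64_t block_spte(uint64_t spte)
{
	assert(spte & SPTE_PRESENT);
	return (spte & ~SPTE_PRESENT) | SPTE_BLOCKED;
}

/* Refault path: audit the PFN, then make the SPTE present again. */
static bool refault_spte(uint64_t *spte, uint64_t expected_pfn)
{
	if (!(*spte & SPTE_BLOCKED))
		return false;
	if (spte_to_pfn(*spte) != expected_pfn)
		return false;   /* audit failed, the PFN changed */
	*spte = (*spte & ~SPTE_BLOCKED) | SPTE_PRESENT;
	return true;
}

/* Note the full walk over every SPTE: this is the lost "fast zap". */
static void delete_memslot(struct memslot *slots, size_t nr_slots,
			   size_t deleted, enum del_policy policy)
{
	for (size_t i = 0; i < nr_slots; i++) {
		for (size_t j = 0; j < slots[i].nr_sptes; j++) {
			uint64_t *spte = &slots[i].sptes[j];

			if (i == deleted)
				*spte = 0;   /* the deleted slot is always zapped */
			else if (policy == POLICY_BLOCK_UNRELATED) {
				if (*spte & SPTE_PRESENT)
					*spte = block_spte(*spte);
			} else if (!slots[i].is_private)
				*spte = 0;   /* unrelated shared slot: zap as today */
			/* else: unrelated private slot, left untouched */
		}
	}
}
```

With POLICY_BLOCK_UNRELATED, a blocked SPTE still carries its PFN, so
refault_spte() can compare it against whatever the fault handler resolves and
refuse to reinstall the mapping on a mismatch.  With POLICY_PRESERVE_PRIVATE,
nothing needs to be audited for private memslots because their SPTEs are never
touched by an unrelated deletion.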