On Wed, Aug 21, 2019 at 01:35:04PM -0600, Alex Williamson wrote: > On Wed, 21 Aug 2019 13:08:59 -0600 > Alex Williamson <alex.williamson@xxxxxxxxxx> wrote: > > Does this suggests something is still fundamentally wrong with the > > premise of this change or have I done something stupid? > > Seems the latter, particularly your comment that we're looking for > pages pointing to the gfn range to be removed, not just those in the > range. Slot gfn ranges like ffe00-ffe1f are getting reduced to 0-0 or > c0000-c0000, zapping zero or c0000, and I think one of the ones you > were looking for c1080-c1083 is reduce to c1000-c1000 and therefore > zaps sp->gfn c1000. I'll keep looking. Thanks, Ya. As far as where to look, at this point I don't think it's an issue of incorrect zapping. Not because I'm 100% confident the zapping logic is correct, but because many of the tests, e.g. removing 'sp->gfn != gfn' and not being able to exclude APIC/IOAPIC ranges, suggest that the badness is 'fixed' by zapping seemingly unrelated sps. In other words, it may be fundamentally wrong to zap only the memslot being removed, but I really want to know why. History isn't helpful as KVM has always zapped all pages when removing a memslot (on x86), and the introduction of the per-memslot flush hook in commit 2df72e9bc4c5 ("KVM: split kvm_arch_flush_shadow") was all about refactoring generic code, and doesn't have any information on whether per-memslot flushing was actually tried for x86.