Hi James,

On Wed, Dec 04, 2024 at 07:13:41PM +0000, James Houghton wrote:
> Adhering to the requirements of KVM Userfault:
>
> 1. When it is toggled (either on or off), zap the second stage with
>    kvm_arch_flush_shadow_memslot(). This is to (1) respect
>    userfault-ness and (2) to reconstruct block mappings.
> 2. While KVM_MEM_USERFAULT is enabled, restrict new second-stage mappings
>    to be PAGE_SIZE, just like when dirty logging is enabled.
>
> Signed-off-by: James Houghton <jthoughton@xxxxxxxxxx>
> ---
>   I'm not 100% sure if kvm_arch_flush_shadow_memslot() is correct in
>   this case (like if the host does not have S2FWB).

Invalidating the stage-2 entries is of course necessary for correctness
on the !USERFAULT -> USERFAULT transition, and the MMU will do the right
thing regardless of whether hardware implements FEAT_S2FWB.

What I think you may be getting at is the *performance* implications are
quite worrying without FEAT_S2FWB due to the storm of CMOs, and I'd
definitely agree with that.

> @@ -2062,6 +2069,20 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
>  				   enum kvm_mr_change change)
>  {
>  	bool log_dirty_pages = new && new->flags & KVM_MEM_LOG_DIRTY_PAGES;
> +	u32 changed_flags = (new ? new->flags : 0) ^ (old ? old->flags : 0);
> +
> +	/*
> +	 * If KVM_MEM_USERFAULT changed, drop all the stage-2 mappings so that
> +	 * we can (1) respect userfault-ness or (2) create block mappings.
> +	 */
> +	if ((changed_flags & KVM_MEM_USERFAULT) && change == KVM_MR_FLAGS_ONLY)
> +		kvm_arch_flush_shadow_memslot(kvm, old);

I'd strongly prefer that we make (2) a userspace problem and don't
eagerly invalidate stage-2 mappings on the USERFAULT -> !USERFAULT
change.

Having implied user-visible behaviors on ioctls is never good, and for
systems without FEAT_S2FWB you might be better off avoiding the unmap in
the first place.

So, if userspace decides there's a benefit to invalidating the stage-2
MMU, it can just delete + recreate the memslot.

--
Thanks,
Oliver
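
For illustration only, the delete + recreate approach could look roughly
like the sketch below from the VMM side. It relies on the existing
KVM_SET_USER_MEMORY_REGION ioctl, where memory_size == 0 deletes a slot.
The helper names memslot_delete_request() and recreate_memslot() are
made up for this example and are not part of the series or of any VMM.
It also assumes the caller still has the original slot description and
has already stopped vCPUs from touching the range, since accesses in the
window between the two ioctls would find no memslot.

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper: build the "delete" form of a memslot description.
 * KVM treats memory_size == 0 as a request to delete the slot.
 */
static struct kvm_userspace_memory_region
memslot_delete_request(const struct kvm_userspace_memory_region *slot)
{
	struct kvm_userspace_memory_region del = *slot;

	del.memory_size = 0;
	return del;
}

/*
 * Hypothetical helper: delete the memslot and install it again with its
 * original parameters. Returns 0 on success, -1 (errno set by ioctl())
 * on failure. If the second ioctl fails, the slot stays deleted.
 */
static int recreate_memslot(int vm_fd,
			    const struct kvm_userspace_memory_region *slot)
{
	struct kvm_userspace_memory_region del = memslot_delete_request(slot);

	if (ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &del) < 0)
		return -1;

	return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, slot) < 0 ? -1 : 0;
}
```

A VMM would presumably call recreate_memslot(vm_fd, &slot) only after
clearing the userfault flag and deciding that rebuilding block mappings
is worth the cost of the unmap on its hardware.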