On Wed, Dec 4, 2024 at 3:07 PM Oliver Upton <oliver.upton@xxxxxxxxx> wrote: > > Hi James, Hi Oliver! > > On Wed, Dec 04, 2024 at 07:13:41PM +0000, James Houghton wrote: > > Adhering to the requirements of KVM Userfault: > > > > 1. When it is toggled (either on or off), zap the second stage with > > kvm_arch_flush_shadow_memslot(). This is to (1) respect > > userfault-ness and (2) to reconstruct block mappings. > > 2. While KVM_MEM_USERFAULT is enabled, restrict new second-stage mappings > > to be PAGE_SIZE, just like when dirty logging is enabled. > > > > Signed-off-by: James Houghton <jthoughton@xxxxxxxxxx> > > --- > > I'm not 100% sure if kvm_arch_flush_shadow_memslot() is correct in > > this case (like if the host does not have S2FWB). > > Invalidating the stage-2 entries is of course necessary for correctness > on the !USERFAULT -> USERFAULT transition, and the MMU will do the right > thing regardless of whether hardware implements FEAT_S2FWB. > > What I think you may be getting at is the *performance* implications are > quite worrying without FEAT_S2FWB due to the storm of CMOs, and I'd > definitely agree with that. Thanks for clarifying that for me. > > @@ -2062,6 +2069,20 @@ void kvm_arch_commit_memory_region(struct kvm *kvm, > > enum kvm_mr_change change) > > { > > bool log_dirty_pages = new && new->flags & KVM_MEM_LOG_DIRTY_PAGES; > > + u32 changed_flags = (new ? new->flags : 0) ^ (old ? old->flags : 0); > > + > > + /* > > + * If KVM_MEM_USERFAULT changed, drop all the stage-2 mappings so that > > + * we can (1) respect userfault-ness or (2) create block mappings. > > + */ > > + if ((changed_flags & KVM_MEM_USERFAULT) && change == KVM_MR_FLAGS_ONLY) > > + kvm_arch_flush_shadow_memslot(kvm, old); > > I'd strongly prefer that we make (2) a userspace problem and don't > eagerly invalidate stage-2 mappings on the USERFAULT -> !USERFAULT > change. > > Having implied user-visible behaviors on ioctls is never good, and for > systems without FEAT_S2FWB you might be better off avoiding the unmap in > the first place. > > So, if userspace decides there's a benefit to invalidating the stage-2 > MMU, it can just delete + recreate the memslot. Ok I think that's reasonable. So for USERFAULT -> !USERFAULT, I'll just follow the precedent set by dirty logging. For x86 today, we collapse the mappings, and for arm64 we do not. Is arm64 ever going to support collapsing back to huge mappings after dirty logging is disabled?