On Tue, Oct 1, 2024 at 3:17 PM David Matlack <dmatlack@xxxxxxxxxx> wrote:
>
> On 2024-09-13 02:43 PM, Vipin Sharma wrote:
> > @@ -6997,13 +7007,50 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, u64 gen)
> >  static unsigned long mmu_shrink_scan(struct shrinker *shrink,
> >                                       struct shrink_control *sc)
> >  {
> > -        return SHRINK_STOP;
> > +        struct kvm *kvm, *next_kvm, *first_kvm = NULL;
> > +        unsigned long i, freed = 0;
> > +        struct kvm_vcpu *vcpu;
> > +
> > +        mutex_lock(&kvm_lock);
> > +        list_for_each_entry_safe(kvm, next_kvm, &vm_list, vm_list) {
> > +                if (!first_kvm)
> > +                        first_kvm = kvm;
> > +                else if (first_kvm == kvm)
> > +                        break;
> > +
> > +                list_move_tail(&kvm->vm_list, &vm_list);
> > +
> > +                kvm_for_each_vcpu(i, vcpu, kvm) {
> > +                        if (!mutex_trylock(&vcpu->arch.mmu_memory_cache_lock))
> > +                                continue;
> > +                        freed += kvm_mmu_empty_memory_cache(&vcpu->arch.mmu_shadow_page_cache);
> > +                        freed += kvm_mmu_empty_memory_cache(&vcpu->arch.mmu_shadowed_info_cache);
> > +                        mutex_unlock(&vcpu->arch.mmu_memory_cache_lock);
> > +                        if (freed >= sc->nr_to_scan)
> > +                                goto out;
>
> Looking at the caller in mm/shrinker.c, sc->nr_to_scan will be <= 128
> (SHRINK_BATCH), which is only enough for 2 vCPUs. So I think the
> shrinker will only ever free 2 vCPU caches of each VM (probably the
> first 2 vCPUs) before reordering the list and moving onto the next VM on
> the next call.
>
> Does that match the behavior you observe?

Yes, for dropping the cache one time on a big VM, I get multiple calls
of mmu_shrink_scan() where sc->nr_to_scan is at most 128 in each call.

mmu_memory_cache_lock availability will play a role in selecting the
two vCPUs. On a VM where not many faults are happening, it will
probably be the first two vCPUs.
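
For reference, the 128 cap comes from the batching loop in
do_shrink_slab() in mm/shrinker.c. Roughly (this is a simplified
sketch from my reading of that loop, not the exact upstream code), the
core hands the ->scan_objects() callback at most batch_size objects per
call, where batch_size defaults to SHRINK_BATCH (128) when the shrinker
does not set ->batch:

        /* Simplified sketch of the batching in do_shrink_slab(). */
        long batch_size = shrinker->batch ? shrinker->batch : SHRINK_BATCH;

        while (total_scan >= batch_size || total_scan >= freeable) {
                unsigned long nr_to_scan = min(batch_size, total_scan);

                /* This is what arrives as sc->nr_to_scan in mmu_shrink_scan(). */
                shrinkctl->nr_to_scan = nr_to_scan;
                shrinkctl->nr_scanned = nr_to_scan;

                ret = shrinker->scan_objects(shrinker, shrinkctl);
                if (ret == SHRINK_STOP)
                        break;
                freed += ret;

                total_scan -= shrinkctl->nr_scanned;
                cond_resched();
        }

So one shrink pass over a large VM shows up as a series of these
<= 128-object calls, which matches the behavior I observe.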