On Sat, Apr 6, 2024 at 7:26 PM maobibo <maobibo@xxxxxxxxxxx> wrote:
>
>
> On 2024/4/5 1:10 AM, Sean Christopherson wrote:
> > On Thu, Apr 04, 2024, David Matlack wrote:
> >> On Tue, Apr 2, 2024 at 6:50 PM maobibo <maobibo@xxxxxxxxxxx> wrote:
> >>>> This change eliminates dips in guest performance during live migration
> >>>> in a 160 vCPU VM when userspace is issuing CLEAR ioctls (tested with
> >>>> 1GiB and 8GiB CLEARs). Userspace could issue finer-grained CLEARs, which
> >>> Frequently dropping/reacquiring mmu_lock will cause the userspace
> >>> migration process issuing CLEAR ioctls to contend with the 160 vCPUs,
> >>> so migration may become slower.
> >>
> >> In practice we have not found this to be the case. With this patch
> >> applied we see a significant improvement in guest workload throughput
> >> while userspace is issuing CLEAR ioctls, without any change to the
> >> overall migration duration.
> >
> > ...
> >
> >> In the case of this patch, there doesn't seem to be a trade-off. We
> >> see an improvement to vCPU performance without any regression in
> >> migration duration or other metrics.
> >
> > For x86. We need to keep in mind that not all architectures have x86's
> > optimization around dirty logging faults, or around faults in general.
> > E.g. LoongArch's (which I assume is Bibo Mao's primary interest)
> > kvm_map_page_fast() still acquires mmu_lock. And if the fault can't be
> > handled in the fast path, KVM will actually acquire mmu_lock twice
> > (mmu_lock is dropped after the fast-path, then reacquired after the
> > mmu_seq and fault-in pfn stuff).
> Yes, there is no lock in fast_page_fault() on x86; however, the mmu
> spinlock is held on the fast path on LoongArch. I did not notice this;
> originally I thought there was read_lock on the x86 fast path for pte
> modification and write_lock for page table modification.

Most[*] vCPU faults on KVM/x86 are handled as follows:

 - vCPU write-protection and access-tracking faults are handled
   locklessly (fast_page_fault()).

 - All other vCPU faults are handled under read_lock(&kvm->mmu_lock).

[*] The locking is different when nested virtualization is in use, TDP
(i.e. stage-2 hardware) is disabled, and/or kvm.tdp_mmu=N.
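For illustration, that x86 flow looks roughly like the below. This is a
simplified sketch modeled on KVM/x86's TDP MMU fault path, not the
actual kernel code (memory-cache top-up, mmu_seq checks, and pfn
fault-in are all elided, and the function name here is made up):

  /* Sketch: how a TDP MMU vCPU fault is dispatched on x86. */
  static int tdp_page_fault_sketch(struct kvm_vcpu *vcpu,
                                   struct kvm_page_fault *fault)
  {
          int r;

          /*
           * Write-protection and access-tracking faults are resolved
           * by atomically replacing the SPTE, with mmu_lock not held
           * at all.  RET_PF_INVALID means the fast path couldn't
           * handle the fault.
           */
          r = fast_page_fault(vcpu, fault);
          if (r != RET_PF_INVALID)
                  return r;

          /*
           * Everything else runs with mmu_lock held for read, so vCPUs
           * don't serialize against each other, only against writers
           * (e.g. CLEAR_DIRTY_LOG, which takes mmu_lock for write).
           */
          read_lock(&vcpu->kvm->mmu_lock);
          r = kvm_tdp_mmu_map(vcpu, fault);
          read_unlock(&vcpu->kvm->mmu_lock);

          return r;
  }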
> >
> > So for x86, I think we can comfortably state that this change is a net
> > positive for all scenarios. But for other architectures, that might not
> > be the case. I'm not saying this isn't a good change for other
> > architectures, just that we don't have relevant data to really know for
> > sure.
> From the code, there is contention between the migration assistant
> thread and vCPU threads in the slow path, where huge pages need to be
> split if the memslot has dirty logging enabled; however, there is no
> contention between the migration assistant thread and the vCPU fault
> fast path. If that is right, the negative effect is small on x86.

Right, there is no contention between CLEAR_DIRTY_LOG and vCPU
write-protection faults on KVM/x86. KVM/arm64 handles write-protection
faults under the read-lock.

KVM/x86 and KVM/arm64 also both have eager page splitting, where huge
pages are split during CLEAR_DIRTY_LOG, so there are no vCPU faults to
split huge pages either.

> >
> > Absent performance data for other architectures, which is likely going
> > to be difficult/slow to get, it might make sense to have this be opt-in
> > to start. We could even do it with minimal #ifdeffery, e.g. something
> > like the below would allow x86 to do whatever locking it wants in
> > kvm_arch_mmu_enable_log_dirty_pt_masked() (I assume we want to give
> > kvm_get_dirty_log_protect() similar treatment?).
> >
> > I don't love the idea of adding more arch-specific MMU behavior (going
> > the wrong direction), but it doesn't seem like an unreasonable approach
> > in this case.
> No special modification for this, it is a little strange. The LoongArch
> page fault fast path can be improved later.

Sorry, I don't follow exactly what you mean here. Are you saying Sean's
patch is not required for LoongArch and, instead, LoongArch should/will
avoid acquiring the mmu_lock when handling write-protection faults?
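P.S. For anyone reading along in the archives: Sean's snippet got
trimmed in the quoting above. The opt-in he's describing would be shaped
roughly like the below -- this is my reconstruction for illustration
only, not his actual diff, and the Kconfig symbol name is made up:

  /* virt/kvm/kvm_main.c */

  /*
   * Hypothetical opt-in: an architecture that selects this handles
   * mmu_lock itself inside kvm_arch_mmu_enable_log_dirty_pt_masked(),
   * and is therefore free to drop and reacquire the lock between
   * batches.  All other architectures keep the current behavior of
   * holding mmu_lock across the whole operation.
   */
  #ifdef CONFIG_KVM_ARCH_DIRTY_LOG_LOCKING        /* made-up symbol */
  #define KVM_DIRTY_LOG_LOCK(kvm)
  #define KVM_DIRTY_LOG_UNLOCK(kvm)
  #else
  #define KVM_DIRTY_LOG_LOCK(kvm)         KVM_MMU_LOCK(kvm)
  #define KVM_DIRTY_LOG_UNLOCK(kvm)       KVM_MMU_UNLOCK(kvm)
  #endif

  static int kvm_clear_dirty_log_protect(struct kvm *kvm,
                                         struct kvm_clear_dirty_log *log)
  {
          ...
          KVM_DIRTY_LOG_LOCK(kvm);
          for (...each chunk of the dirty bitmap...) {
                  ...
                  kvm_arch_mmu_enable_log_dirty_pt_masked(kvm, memslot,
                                                          offset, mask);
          }
          KVM_DIRTY_LOG_UNLOCK(kvm);
          ...
  }

(The same treatment would presumably apply to kvm_get_dirty_log_protect(),
per Sean's parenthetical.)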