On Fri, Apr 03, 2015 at 03:27:51PM +0800, Xiao Guangrong wrote:
>
>
>On 04/03/2015 02:09 PM, Wanpeng Li wrote:
>>There are two scenarios for the requirement of collapsing small sptes
>>into large sptes.
>>- dirty logging tracks sptes in 4k granularity, so large sptes are split,
>>  the large sptes will be reallocated in the destination machine and the
>>  guest in the source machine will be destroyed when live migration successfully.
>>  However, the guest in the source machine will continue to run if live migration
>>  fail due to some reasons, the sptes still keep small which lead to bad
>>  performance.
>>- our customers write tools to track the dirty speed of guests by EPT D bit/PML
>>  in order to determine the most appropriate one to be live migrated, however
>>  sptes will still keep small after tracking dirty speed.
>>
>>This patch introduce lazy collapse small sptes into large sptes, the memory region
>>will be scanned on the ioctl context when dirty log is stopped, the ones which can
>>be collapsed into large pages will be dropped during the scan, it depends the on
>>later #PF to reallocate all large sptes.
>>
>>Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxxxxxx>
>>---
>>v1 -> v2:
>> * use 'bool' instead of 'int'
>> * add more comments
>> * fix can not get the next spte after drop the current spte
>>
>> arch/x86/include/asm/kvm_host.h |  2 ++
>> arch/x86/kvm/mmu.c              | 71 +++++++++++++++++++++++++++++++++++++++++
>> arch/x86/kvm/x86.c              | 19 +++++++++++
>> 3 files changed, 92 insertions(+)
>>
>>diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>>index 30b28dc..91b5bdb 100644
>>--- a/arch/x86/include/asm/kvm_host.h
>>+++ b/arch/x86/include/asm/kvm_host.h
>>@@ -854,6 +854,8 @@ void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask,
>> void kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
>> void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
>> 				      struct kvm_memory_slot *memslot);
>>+void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
>>+				   struct kvm_memory_slot *memslot);
>> void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm,
>> 				   struct kvm_memory_slot *memslot);
>> void kvm_mmu_slot_largepage_remove_write_access(struct kvm *kvm,
>>diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>>index cee7592..df3f2e3 100644
>>--- a/arch/x86/kvm/mmu.c
>>+++ b/arch/x86/kvm/mmu.c
>>@@ -4465,6 +4465,77 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
>> 	kvm_flush_remote_tlbs(kvm);
>> }
>>
>>+static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
>>+		unsigned long *rmapp)
>>+{
>>+	u64 *sptep;
>>+	struct rmap_iterator iter;
>>+	int need_tlb_flush = 0;
>>+	pfn_t pfn;
>>+	struct kvm_mmu_page *sp;
>>+
>>+	while ((sptep = rmap_get_first(*rmapp, &iter))) {
>>+		BUG_ON(!(*sptep & PT_PRESENT_MASK));
>>+
>>+		sp = page_header(__pa(sptep));
>>+		pfn = spte_to_pfn(*sptep);
>>+
>>+		/*
>>+		 * Let support EPT only now, an efficient way need to be figure
>>+		 * out to let these code be aware what mapping level used in
>>+		 * guest.
>
>This English seems strange... but i am not good at it. :)

I'm not good at English either. I'll update it in v3. :)

>
>>+		 */
>>+		if (sp->role.direct &&
>>+			!kvm_is_reserved_pfn(pfn) &&
>>+			PageTransCompound(pfn_to_page(pfn))) {
>>+			drop_spte(kvm, sptep);
>>+			need_tlb_flush = 1;
>>+		}
>
>If the conditions are not comfortable, it does loop forever...

I'll fix it in v3.

>
>Otherwise, it looks good to me.
>
>Reviewed-by: Xiao Guangrong <guangrong.xiao@xxxxxxxxxxxxxxx>

Thanks for your review. :)

Regards,
Wanpeng Li
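
For reference, the endless loop pointed out above comes from the v2 loop always calling rmap_get_first(): when the first spte does not meet the collapse conditions, nothing is dropped and the same spte is returned again. One way to avoid it is to restart from the head only after a drop and to advance otherwise. The following is a minimal userspace sketch of that pattern, not the v3 patch itself; the toy_* names, the array-backed chain and the 'collapsible' flag are hypothetical stand-ins for the rmap chain, rmap_get_first()/rmap_get_next(), drop_spte() and the role.direct/PageTransCompound() check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ENTRIES 8

/* Hypothetical stand-in for one spte on an rmap chain. */
struct toy_entry {
	bool present;     /* models *sptep & PT_PRESENT_MASK */
	bool collapsible; /* models the direct + huge-page-backed check */
};

/* Hypothetical stand-in for the rmap chain: a small packed array. */
struct toy_rmap {
	struct toy_entry entries[TOY_MAX_ENTRIES];
	size_t count;
};

struct toy_iter {
	size_t pos;
};

/* Start a walk at the head of the chain; NULL if the chain is empty. */
static struct toy_entry *toy_get_first(struct toy_rmap *r, struct toy_iter *it)
{
	it->pos = 0;
	return r->count ? &r->entries[0] : NULL;
}

/* Step to the following entry; NULL once the end is reached. */
static struct toy_entry *toy_get_next(struct toy_rmap *r, struct toy_iter *it)
{
	it->pos++;
	return it->pos < r->count ? &r->entries[it->pos] : NULL;
}

/* Remove an entry by shifting the tail down; this invalidates the iterator. */
static void toy_drop(struct toy_rmap *r, struct toy_entry *e)
{
	size_t i = (size_t)(e - r->entries);

	for (; i + 1 < r->count; i++)
		r->entries[i] = r->entries[i + 1];
	r->count--;
}

/* Drop every collapsible entry; returns true if anything was dropped. */
static bool toy_zap_collapsible(struct toy_rmap *r)
{
	struct toy_iter it;
	struct toy_entry *e;
	bool need_flush = false;

	for (e = toy_get_first(r, &it); e; ) {
		assert(e->present);

		if (e->collapsible) {
			toy_drop(r, e);
			need_flush = true;
			/* The chain changed under the iterator: restart. */
			e = toy_get_first(r, &it);
		} else {
			/* Nothing dropped: advance, so the walk terminates. */
			e = toy_get_next(r, &it);
		}
	}

	return need_flush;
}
```

With a chain whose head entry is not collapsible, the else branch is what lets the walk move on; the v2 loop has no such step.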