2017-11-16 5:05 GMT+08:00 Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>:
> On Mon, Nov 13, 2017 at 02:01:16AM -0800, Wanpeng Li wrote:
>> Remote flushing api's does a busy wait which is fine in bare-metal
>> scenario. But with-in the guest, the vcpus might have been pre-empted
>> or blocked. In this scenario, the initiator vcpu would end up
>> busy-waiting for a long amount of time.
>>
>> This patch set implements para-virt flush tlbs making sure that it
>> does not wait for vcpus that are sleeping. And all the sleeping vcpus
>> flush the tlb on guest enter. Idea was discussed here:
>> https://lkml.org/lkml/2012/2/20/157
>>
>> The best result is achieved when we're overcommitting the host by running
>> multiple vCPUs on each pCPU. In this case PV tlb flush avoids touching
>> vCPUs which are not scheduled and avoid the wait on the main CPU.
>>
>> In addition, thanks for commit 9e52fc2b50d ("x86/mm: Enable RCU based
>> page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)")
>>
>> Test on a Haswell i7 desktop 4 cores (2HT), so 8 pCPUs, running ebizzy
>> in one linux guest.
>
> 8 pCPUS?
>>
>> ebizzy -M
>>               vanilla    optimized     boost
>> 8 vCPUs        10152        10083     -0.68%
>> 16 vCPUs        1224         4866     297.5%
>> 24 vCPUs        1109         3871       249%
>> 32 vCPUs        1025         3375     229.3%
>
> so this is all just one guest? What happens if you have say a 64pCPU
> machine with eight of these guests? That is more of a realistic
> workload in today's cloud situations.

Yeah, I tested on a 2-socket Xeon Gold 6142 2.6GHz machine, 16 cores per
socket (2 HTs each), so 64 pCPUs; each VM has 64 vCPUs.

          vanilla    optimized     boost
1VM        46799        46788     -0.01%
2VM        23962        42691        78%
3VM        16152        37539       132%

Regards,
Wanpeng Li

>
>>
>> Note: The patchset is rebased against "locking/qspinlock/x86: Avoid
>> test-and-set when PV_DEDICATED is set" v3
>>
>> v4 -> v5:
>>  * flushmask instead of cpumask
>>
>> v3 -> v4:
>>  * use READ_ONCE()
>>  * use try_cmpxchg instead of cmpxchg
>>  * add {} to if
>>  * no FLUSH flags to preserve during set_preempted
>>  * "KVM: X86" prefix to patch subject
>>
>> v2 -> v3:
>>  * percpu cpumask
>>
>> v1 -> v2:
>>  * a new CPUID feature bit
>>  * fix cmpxchg check
>>  * use kvm_vcpu_flush_tlb() to get the statistics right
>>  * just OR the KVM_VCPU_PREEMPTED in kvm_steal_time_set_preempted
>>  * add a new bool argument to kvm_x86_ops->tlb_flush
>>  * __cpumask_clear_cpu() instead of cpumask_clear_cpu()
>>  * not put cpumask_t on stack
>>  * rebase the patchset against "locking/qspinlock/x86: Avoid
>>    test-and-set when PV_DEDICATED is set" v3
>>
>> Wanpeng Li (4):
>>   KVM: X86: Add vCPU running/preempted state
>>   KVM: X86: Add paravirt remote TLB flush
>>   KVM: X86: introduce invalidate_gpa argument to tlb flush
>>   KVM: X86: Add flush_on_enter before guest enter
>>
>>  Documentation/virtual/kvm/cpuid.txt  |  4 ++++
>>  arch/x86/include/asm/kvm_host.h      |  2 +-
>>  arch/x86/include/uapi/asm/kvm_para.h |  6 +++++
>>  arch/x86/kernel/kvm.c                | 46 ++++++++++++++++++++++++++++++++++--
>>  arch/x86/kvm/cpuid.c                 |  3 ++-
>>  arch/x86/kvm/svm.c                   | 14 +++++------
>>  arch/x86/kvm/vmx.c                   | 21 ++++++++--------
>>  arch/x86/kvm/x86.c                   | 25 +++++++++++++------
>>  8 files changed, 88 insertions(+), 30 deletions(-)
>>
>> --
>> 2.7.4
>>
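
For readers following the thread, the flush scheme described in the quoted cover letter (skip sleeping vCPUs, let them flush on guest enter) can be modelled roughly as below. This is only a toy, single-threaded Python sketch, not the kernel code from the patches: the names `VCpu`, `PREEMPTED`, `FLUSH_TLB`, `remote_flush` and `guest_enter` are all hypothetical, and the model ignores the race between guest and host that the changelog addresses with try_cmpxchg.

```python
# Toy, single-threaded model of the paravirtual TLB flush idea.
# Every name here is hypothetical; this is not the code from the patch set.
from dataclasses import dataclass

PREEMPTED = 1 << 0   # host has descheduled this vCPU
FLUSH_TLB = 1 << 1   # vCPU must flush its TLB before re-entering the guest


@dataclass
class VCpu:
    state: int = 0     # flag word shared between host and guest in this model
    flushes: int = 0   # number of TLB flushes this vCPU has performed


def remote_flush(vcpus, targets):
    """Guest side: flush the TLB on `targets`.

    Returns the indices that needed a synchronous flush (an IPI plus a
    busy wait in a real guest). Preempted vCPUs are only marked.
    """
    need_ipi = []
    for i in targets:
        v = vcpus[i]
        if v.state & PREEMPTED:
            # Sleeping vCPU: record the pending flush and do not wait for it.
            v.state |= FLUSH_TLB
        else:
            need_ipi.append(i)
    for i in need_ipi:
        vcpus[i].flushes += 1   # running vCPUs flush immediately
    return need_ipi


def guest_enter(v):
    """Host side: called when a preempted vCPU is scheduled back in."""
    if v.state & FLUSH_TLB:
        v.flushes += 1          # deferred flush happens before guest code runs
    v.state = 0                 # clears both PREEMPTED and FLUSH_TLB
```

With three vCPUs of which the second is preempted, `remote_flush(vcpus, [1, 2])` only has to wait for vCPU 2; vCPU 1 performs its flush later, when `guest_enter()` runs for it. That is the waiting the overcommitted runs above avoid.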