On Tue, Aug 21, 2012 at 04:55:52PM +0530, Nikunj A. Dadhania wrote:
> The remote flushing APIs do a busy wait, which is fine in a bare-metal
> scenario. But within a guest, the target vcpus might have been preempted
> or blocked, in which case the initiator vcpu ends up busy-waiting for a
> long time.
>
> This was discovered in our gang scheduling tests. One way to solve it is
> by para-virtualizing flush_tlb_others_ipi (which now shows up as
> smp_call_function_many after Alex Shi's TLB optimization).
>
> This patch set implements para-virt TLB flushes that do not wait for
> vcpus that are sleeping; instead, all the sleeping vcpus flush the TLB
> on guest enter. The idea was discussed here:
> https://lkml.org/lkml/2012/2/20/157
>
> This also brings in one more dependency, for the lockless page walk
> performed by get_user_pages_fast (gup_fast). gup_fast disables
> interrupts and assumes that the pages will not be freed during that
> period. That was fine while flush_tlb_others_ipi waited for all the
> IPIs to be processed before returning. With the new approach of not
> waiting for the sleeping vcpus, this assumption no longer holds. So
> HAVE_RCU_TABLE_FREE is now used to free the pages, which makes sure
> that all the cpus have at least processed the smp callback before the
> pages are freed.
>
> Changelog from v3:
> • Add helper for cleaning up vcpu_state information (Marcelo)
> • Fix code for checking vs_page and leaking page refs (Marcelo)
>
> Changelog from v2:
> • Rebase to 3.5-based linus kernel (commit f7da9cd).
> • Port PV-Flush to the new TLB-optimization code by Alex Shi
> • Use pinned pages to avoid overhead during guest enter/exit (Marcelo)
> • Remove kick, as it was not improving much
> • Use bit fields in the state flag (flush_on_enter and vcpu_running)
>   to avoid smp barriers (Marcelo)
>
> Changelog from v1:
> • Race fixes reported by Vatsa
> • Address the gup_fast dependency using PeterZ's rcu table free patch
> • Fix rcu_table_free for hw page-table walkers
>
> Here are the results from PLE hardware. Setup details:
> • 32 CPUs (HT disabled)
> • 64-bit VM
> • 32 vcpus
> • 8GB RAM
>
> base      = 3.6-rc1 + ple handler optimization patch
> pvflushv4 = 3.6-rc1 + ple handler optimization patch + pvflushv4 patch
>
> kernbench (lower is better)
> ===========================
>           base        pvflushv4   %improvement
> 1VM        48.5800     46.8513     3.55846
> 2VM       108.1823    104.6410     3.27346
> 3VM       183.2733    163.3547    10.86825
>
> ebizzy (higher is better)
> =========================
>           base        pvflushv4   %improvement
> 1VM      2414.5000   2089.8750   -13.44481
> 2VM      2167.6250   2371.7500     9.41699
> 3VM      1600.1111   2102.5556    31.40060
>
> Thanks to Raghu for running the tests.
>
> [1] http://article.gmane.org/gmane.linux.kernel/1329752
>
> ---
>
> Nikunj A. Dadhania (6):
>   KVM Guest: Add VCPU running/pre-empted state for guest
>   KVM-HV: Add VCPU running/pre-empted state for guest
>   KVM Guest: Add paravirt kvm_flush_tlb_others
>   KVM-HV: Add flush_on_enter before guest enter
>   Enable HAVE_RCU_TABLE_FREE for kvm when PARAVIRT_TLB_FLUSH is enabled
>   KVM-doc: Add paravirt tlb flush document
>
> Peter Zijlstra (2):
>   mm, x86: Add HAVE_RCU_TABLE_FREE support
>   mm: Add missing TLB invalidate to RCU page-table freeing
>
>
>  Documentation/virtual/kvm/msr.txt                |    4 +
>  Documentation/virtual/kvm/paravirt-tlb-flush.txt |   53 ++++++++++++++
>  arch/Kconfig                                     |    3 +
>  arch/powerpc/Kconfig                             |    1
>  arch/sparc/Kconfig                               |    1
>  arch/x86/Kconfig                                 |   11 +++
>  arch/x86/include/asm/kvm_host.h                  |    7 ++
>  arch/x86/include/asm/kvm_para.h                  |   13 +++
>  arch/x86/include/asm/tlb.h                       |    1
>  arch/x86/include/asm/tlbflush.h                  |   11 +++
>  arch/x86/kernel/kvm.c                            |   38 ++++++++++
>  arch/x86/kvm/cpuid.c                             |    1
>  arch/x86/kvm/x86.c                               |   84 +++++++++++++++++++++-
>  arch/x86/mm/pgtable.c                            |    6 +-
>  arch/x86/mm/tlb.c                                |   36 +++++++++
>  include/asm-generic/tlb.h                        |    9 ++
>  mm/memory.c                                      |   43 ++++++++++-
>  17 files changed, 311 insertions(+), 11 deletions(-)
>  create mode 100644 Documentation/virtual/kvm/paravirt-tlb-flush.txt

Avi, PeterZ, can you please review?