[Patch v4 00/18] NUMA aware page table allocation

Vipin Sharma <vipinsh@xxxxxxxxxx> · Mon, 6 Mar 2023 14:41:09 -0800

Hi,

This series builds on the feedback received on v3.

The biggest functional change is that NUMA aware page tables are now
enabled on a per-VM basis instead of through a module parameter that
applies to all VMs on a host. This was decided in an internal discussion
to avoid forcing every VM on a host to be NUMA aware. We need to collect
more data to see how much performance degradation a VM can suffer in
negative testing, where the vCPUs of a VM always access memory on remote
NUMA nodes instead of staying local, compared to a VM which is not NUMA
aware.
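
For readers who have not enabled a per-VM capability before, here is a
minimal userspace sketch of what opting in a single VM could look like. It
uses the existing KVM_ENABLE_CAP ioctl on a VM file descriptor. The
capability number is passed in as a parameter because the value assigned to
KVM_CAP_NUMA_AWARE_PAGE_TABLE (named in the v4 change log) is not stated in
this letter. The helper names are hypothetical, and the sketch assumes the
capability takes no flags or arguments; the documentation patch in the
series is the authority on that.

```c
#include <assert.h>
#include <linux/kvm.h>
#include <string.h>
#include <sys/ioctl.h>

/*
 * Build a KVM_ENABLE_CAP payload. cap_nr is whatever number the patched
 * uapi header assigns to the capability (not known from this letter).
 * Assumption: no flags or args are needed, so everything else is zeroed.
 */
static void fill_enable_cap(struct kvm_enable_cap *ec, unsigned int cap_nr)
{
	assert(ec != NULL);
	memset(ec, 0, sizeof(*ec));
	ec->cap = cap_nr;
}

/*
 * Enable one capability on one VM. vm_fd is the descriptor returned by
 * KVM_CREATE_VM. Returns 0 on success, -1 with errno set on failure.
 */
static int enable_vm_cap(int vm_fd, unsigned int cap_nr)
{
	struct kvm_enable_cap ec;

	fill_enable_cap(&ec, cap_nr);
	return ioctl(vm_fd, KVM_ENABLE_CAP, &ec);
}
```

A VMM would presumably call something like enable_vm_cap(vm_fd,
KVM_CAP_NUMA_AWARE_PAGE_TABLE) for the VMs it wants to opt in, and leave
the other VMs on the host untouched.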

The other changes are listed in the v4 change log below.

Thanks
Vipin

v4:
- Removed module parameter for enabling NUMA aware page table.
- Added new capability KVM_CAP_NUMA_AWARE_PAGE_TABLE to enable this
  feature per VM.
- Added documentation for the new capability.
- Holding a mutex from just before the top-up until after the
  fault/split is addressed. Previous versions used spinlocks twice,
  first for the top-up and then again when fetching the page from the
  cache.
- Using the existing slots_lock for split_shadow_page_cache operations.
- The KVM MMU shrinker now also shrinks mmu_shadowed_info_cache, in
  addition to split_shadow_page_cache and mmu_shadow_page_cache.
- Reduced cache default size to 4.
- Split patches into smaller ones.
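
The locking change in the list above can be illustrated with a small
userspace analogy. This is only a sketch under stated assumptions, not the
code from the series: struct page_cache, handle_fault() and cache_shrink()
are hypothetical names, pthread mutexes stand in for kernel mutexes, and
calloc() stands in for page allocation. It shows a single lock hold that
covers both the top-up and the fetch, so that in this sketch a concurrent
shrinker cannot empty the cache between the two steps.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define CACHE_CAPACITY 4	/* mirrors the reduced default size in v4 */
#define PAGE_BYTES 4096		/* illustrative page size */

/* Hypothetical stand-in for a page cache; not the KVM structure. */
struct page_cache {
	pthread_mutex_t lock;
	int nobjs;
	void *objs[CACHE_CAPACITY];
};

static void cache_init(struct page_cache *c)
{
	pthread_mutex_init(&c->lock, NULL);
	c->nobjs = 0;
}

/* Fill the cache as far as possible. Caller must hold c->lock. */
static int cache_topup_locked(struct page_cache *c)
{
	while (c->nobjs < CACHE_CAPACITY) {
		void *page = calloc(1, PAGE_BYTES);

		if (!page)
			break;
		c->objs[c->nobjs++] = page;
	}
	return c->nobjs > 0 ? 0 : -1;
}

/* Take one page. Caller must hold c->lock after a successful top-up. */
static void *cache_alloc_locked(struct page_cache *c)
{
	assert(c->nobjs > 0);
	return c->objs[--c->nobjs];
}

/* One lock hold spans the top-up and the fetch. */
static void *handle_fault(struct page_cache *c)
{
	void *page = NULL;

	pthread_mutex_lock(&c->lock);
	if (cache_topup_locked(c) == 0)
		page = cache_alloc_locked(c);
	pthread_mutex_unlock(&c->lock);
	return page;
}

/* Shrinker analogy: free every unused cached page, return the count. */
static int cache_shrink(struct page_cache *c)
{
	int freed = 0;

	pthread_mutex_lock(&c->lock);
	while (c->nobjs > 0) {
		free(c->objs[--c->nobjs]);
		freed++;
	}
	pthread_mutex_unlock(&c->lock);
	return freed;
}
```

With the earlier two-step scheme, the equivalent of cache_shrink() could
run after the top-up released its lock and before the fetch took it again.
Holding one lock across both steps closes that window in this sketch.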

v3: https://lore.kernel.org/lkml/20221222023457.1764-1-vipinsh@xxxxxxxxxx/
- Split patches into smaller ones.
- Repurposed KVM MMU shrinker to free cache pages instead of the oldest
  page table pages.
- Reduced cache size from 40 to 5
- Removed the __weak function; the node value is now initialized in all
  architectures.
- Some name changes.

v2: https://lore.kernel.org/lkml/20221201195718.1409782-1-vipinsh@xxxxxxxxxx/
- All page table pages will be allocated on underlying physical page's
  NUMA node.
- Introduced module parameter, numa_aware_pagetable, to disable this
  feature.
- Using kvm_pfn_to_refcounted_page to get page from a pfn.

v1: https://lore.kernel.org/all/20220801151928.270380-1-vipinsh@xxxxxxxxxx/

Vipin Sharma (18):
  KVM: x86/mmu: Change KVM mmu shrinker to no-op
  KVM: x86/mmu: Remove zapped_obsolete_pages from struct kvm_arch{}
  KVM: x86/mmu: Track count of pages in KVM MMU page caches globally
  KVM: x86/mmu: Shrink shadow page caches via MMU shrinker
  KVM: x86/mmu: Add split_shadow_page_cache pages to global count of MMU
    cache pages
  KVM: x86/mmu: Shrink split_shadow_page_cache via MMU shrinker
  KVM: x86/mmu: Unconditionally count allocations from MMU page caches
  KVM: x86/mmu: Track unused mmu_shadowed_info_cache pages count via
    global counter
  KVM: x86/mmu: Shrink mmu_shadowed_info_cache via MMU shrinker
  KVM: x86/mmu: Add per VM NUMA aware page table capability
  KVM: x86/mmu: Add documentation of NUMA aware page table capability
  KVM: x86/mmu: Allocate NUMA aware page tables on TDP huge page splits
  KVM: mmu: Add common initialization logic for struct
    kvm_mmu_memory_cache{}
  KVM: mmu: Initialize kvm_mmu_memory_cache.gfp_zero to __GFP_ZERO by
    default
  KVM: mmu: Add NUMA node support in struct kvm_mmu_memory_cache{}
  KVM: x86/mmu: Allocate numa aware page tables during page fault
  KVM: x86/mmu: Allocate shadow mmu page table on huge page split on the
    same NUMA node
  KVM: x86/mmu: Reduce default mmu memory cache size

 Documentation/virt/kvm/api.rst   |  29 +++
 arch/arm64/kvm/arm.c             |   2 +-
 arch/arm64/kvm/mmu.c             |   2 +-
 arch/mips/kvm/mips.c             |   3 +
 arch/riscv/kvm/mmu.c             |   8 +-
 arch/riscv/kvm/vcpu.c            |   2 +-
 arch/x86/include/asm/kvm_host.h  |  17 +-
 arch/x86/include/asm/kvm_types.h |   6 +-
 arch/x86/kvm/mmu/mmu.c           | 319 +++++++++++++++++++------------
 arch/x86/kvm/mmu/mmu_internal.h  |  38 ++++
 arch/x86/kvm/mmu/paging_tmpl.h   |  29 +--
 arch/x86/kvm/mmu/tdp_mmu.c       |  23 ++-
 arch/x86/kvm/x86.c               |  18 +-
 include/linux/kvm_host.h         |   2 +
 include/linux/kvm_types.h        |  21 ++
 include/uapi/linux/kvm.h         |   1 +
 virt/kvm/kvm_main.c              |  24 ++-
 17 files changed, 386 insertions(+), 158 deletions(-)

-- 
2.40.0.rc0.216.gc4246ad0f0-goog