Re: [PATCH 0/3] KVM: arm64: nv: Fixes for Nested Virtualization issues

Hi Marc,

On 24-08-2022 11:33 am, Ganapatrao Kulkarni wrote:
This series contains 3 fixes which were found while testing
ARM64 Nested Virtualization patch series.

First patch avoids the restart of hrtimer when timer interrupt is
fired/forwarded to Guest-Hypervisor.

Second patch fixes the vtimer interrupt drop from the Guest-Hypervisor.

Third patch fixes the NestedVM boot hang seen when the Guest Hypervisor is
configured with a 64K page size whereas the Host Hypervisor uses 4K.

These patches are rebased on Nested Virtualization V6 patchset[1].

When I boot a Guest Hypervisor with a large number of cores and then either boot a NestedVM with the same number of cores or boot several NestedVMs simultaneously with fewer cores each, the NestedVM boots very slowly and sometimes hits an RCU soft lockup. I debugged this, and it turned out that many SGIs are asserted to all vCPUs of the Guest-Hypervisor whenever the Guest-Hypervisor KVM code prepares a NestedVM for WFI wakeup/return.

When the Guest Hypervisor prepares a NestedVM to return/resume from WFI, it loads the guest context, the vGIC and timer contexts, etc. In the Guest-Hypervisor's KVM code, gic_poke_irq (called from irq_set_irqchip_state with a spinlock held) writes to the GICD_ISACTIVER register, which results in a mem-abort trap to the Host Hypervisor. While handling the guest mem abort, the Host Hypervisor calls io_mem_abort, which in turn calls vgic_mmio_write_sactive, and that prepares every vCPU of the Guest Hypervisor by sending it an SGI. The number of SGI/IPI calls grows very steeply as more cores are used to boot the Guest Hypervisor.

Code trace:
At Guest-hypervisor: kvm_timer_vcpu_load->kvm_timer_vcpu_load_gic->set_timer_irq_phys_active->
irq_set_irqchip_state->gic_poke_irq

At Host-Hypervisor: io_mem_abort-> kvm_io_bus_write->__kvm_io_bus_write->dispatch_mmio_write->
vgic_mmio_write_sactive->vgic_access_active_prepare->
kvm_kick_many_cpus->smp_call_function_many
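
To put rough numbers on the trace above, here is a back-of-the-envelope model, not kernel code. It assumes that each trapped GICD_ISACTIVER write kicks every vCPU of the Guest Hypervisor exactly once and that every vCPU performs the same number of such writes. The function names and both parameters are hypothetical, and the real count may well differ (for example, if some vCPUs do not need to be kicked).

```c
#include <stdint.h>

/*
 * Toy model only (assumption): one trapped GICD_ISACTIVER write on the
 * Host Hypervisor kicks every vCPU of the Guest Hypervisor once.
 */
static uint64_t kicks_per_trapped_write(uint64_t nr_vcpus)
{
	return nr_vcpus;
}

/*
 * Total kicks when each of the nr_vcpus vCPUs performs writes_per_vcpu
 * trapped writes (e.g. one per WFI wakeup of a NestedVM vCPU).
 */
static uint64_t total_kicks(uint64_t nr_vcpus, uint64_t writes_per_vcpu)
{
	return nr_vcpus * writes_per_vcpu * kicks_per_trapped_write(nr_vcpus);
}
```

Under these assumptions the count grows with the square of the vCPU count: total_kicks(4, 1) is 16, while total_kicks(32, 1) is already 1024 for a single round of WFI wakeups.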

I am currently working around this by passing the "nohlt" kernel parameter to the NestedVM. Any suggestions on how to handle or fix this case and avoid the slow boot of NestedVMs with more cores?

Note: the Guest-Hypervisor and the NestedVM both use the default kernel installed from the Fedora 36 ISO.


[1] https://www.spinics.net/lists/kvm/msg265656.html

D Scott Phillips (1):
   KVM: arm64: nv: only emulate timers that have not yet fired

Ganapatrao Kulkarni (2):
   KVM: arm64: nv: Emulate ISTATUS when emulated timers are fired.
   KVM: arm64: nv: Avoid block mapping if max_map_size is smaller than
     block size.

  arch/arm64/kvm/arch_timer.c | 8 +++++++-
  arch/arm64/kvm/mmu.c        | 2 +-
  2 files changed, 8 insertions(+), 2 deletions(-)


Thanks,
Ganapat


