There are two stages of page faults. A stage 1 page fault is handled by
the guest itself. The guest traps to the host when the page fault is
caused by the stage 2 page table, for example when a mapping is missing.
The guest is then suspended until the requested page is populated.
Populating the requested page can be costly and may involve I/O if the
page was previously swapped out. In that case, the guest is suspended
for at least a few milliseconds, regardless of the overall system load.

This series adds support for asynchronous page faults to improve the
situation above. If populating the requested page is costly, a signal
(PAGE_NOT_PRESENT) is sent to the guest so that the faulting process can
be rescheduled, if that is possible. Otherwise, it is put into
power-saving mode. Another signal (PAGE_READY) is sent to the guest once
the requested page is populated, so that the faulting process can be
woken up from either the waiting state or the power-saving state.

To implement the control flow and convey signals between host and guest,
an IMPDEF system register (SYS_ASYNC_PF_EL1) is introduced. The register
accepts the control block's physical address plus the requested
features. The signal is delivered as a data abort with a specific IMPDEF
Data Fault Status Code (DFSC). The host stores the specific signal in
the control block, to be consumed by the guest.

Todo
====
* CONFIG_KVM_ASYNC_PF_SYNC is disabled for now because the exception
  injection can't work in nested mode. This might be improved in the
  future.
* KVM_ASYNC_PF_SEND_ALWAYS is disabled even with CONFIG_PREEMPTION
  because it's simply not working reliably.
* Tracepoints, which should be done in the short term.
* kvm-unit-test cases.
* More testing and debugging are needed. Sometimes the guest can get
  stuck, and the root cause needs to be figured out.

PATCH[01] renames kvm_vcpu_get_hsr() to kvm_vcpu_get_esr() since the
          aarch32 host isn't supported.
PATCH[02] allows various helper functions to access the ESR value from
          somewhere other than the vCPU struct.
PATCH[03] replaces @hsr with @esr as the aarch32 host isn't supported.
PATCH[04] exports kvm_handle_user_mem_abort(), which is used by the
          subsequent patch.
PATCH[05] introduces an API to inject a data abort with an IMPDEF DFSC.
PATCH[06] supports asynchronous page faults for the host.
PATCH[07] supports asynchronous page faults for the guest.

Testing
=======
Start a VM and put its QEMU process into a specific memory cgroup. The
cgroup's memory limit is less than the total amount of memory assigned
to the VM. For example, the VM is assigned 4GB of memory, but the
cgroup's limit is 2GB. After the VM boots, a program is run to allocate
(and access) all free memory. No system hang is observed.

Gavin Shan (7):
  kvm/arm64: Rename kvm_vcpu_get_hsr() to kvm_vcpu_get_esr()
  kvm/arm64: Detach ESR operator from vCPU struct
  kvm/arm64: Replace hsr with esr
  kvm/arm64: Export kvm_handle_user_mem_abort() with prefault mode
  kvm/arm64: Allow inject data abort with specified DFSC
  kvm/arm64: Support async page fault
  arm64: Support async page fault

 arch/arm64/Kconfig                       |  11 +
 arch/arm64/include/asm/exception.h       |   5 +
 arch/arm64/include/asm/kvm_emulate.h     |  87 +++----
 arch/arm64/include/asm/kvm_host.h        |  46 ++++
 arch/arm64/include/asm/kvm_para.h        |  55 +++++
 arch/arm64/include/asm/sysreg.h          |   3 +
 arch/arm64/include/uapi/asm/Kbuild       |   3 -
 arch/arm64/include/uapi/asm/kvm_para.h   |  22 ++
 arch/arm64/kernel/smp.c                  |  47 ++++
 arch/arm64/kvm/Kconfig                   |   1 +
 arch/arm64/kvm/Makefile                  |   2 +
 arch/arm64/kvm/handle_exit.c             |  48 ++--
 arch/arm64/kvm/hyp/switch.c              |  33 +--
 arch/arm64/kvm/hyp/vgic-v2-cpuif-proxy.c |   7 +-
 arch/arm64/kvm/inject_fault.c            |  38 ++-
 arch/arm64/kvm/sys_regs.c                |  91 +++++--
 arch/arm64/mm/fault.c                    | 239 ++++++++++++++++++-
 virt/kvm/arm/aarch32.c                   |  27 ++-
 virt/kvm/arm/arm.c                       |  36 ++-
 virt/kvm/arm/async_pf.c                  | 290 +++++++++++++++++++++++
 virt/kvm/arm/hyp/aarch32.c               |   4 +-
 virt/kvm/arm/hyp/vgic-v3-sr.c            |   7 +-
 virt/kvm/arm/mmio.c                      |  27 ++-
 virt/kvm/arm/mmu.c                       |  69 ++++--
 24 files changed, 1040 insertions(+), 158 deletions(-)
 create mode 100644 arch/arm64/include/asm/kvm_para.h
 delete mode 100644 arch/arm64/include/uapi/asm/Kbuild
 create mode 100644 arch/arm64/include/uapi/asm/kvm_para.h
 create mode 100644 virt/kvm/arm/async_pf.c

--
2.23.0

_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
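
For readers who want a picture of the PAGE_NOT_PRESENT / PAGE_READY
handshake described in the cover letter, here is a minimal user-space
sketch in C. It is only an illustration, not code from the series: the
struct layout, the enum values, the per-fault token, the rule that the
host does not overwrite an unconsumed signal, and the function names
host_post_signal() and guest_handle_signal() are all assumptions made
for the example. In the real series the guest would reach this logic
from its data abort handler after seeing the IMPDEF DFSC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical signal encoding; the cover letter does not give values. */
enum apf_reason {
	APF_REASON_NONE = 0,
	APF_REASON_PAGE_NOT_PRESENT = 1,
	APF_REASON_PAGE_READY = 2,
};

/* Hypothetical control block shared between host and guest. */
struct apf_control_block {
	uint32_t reason;	/* written by host, consumed by guest */
	uint32_t token;		/* assumed: identifies the pending fault */
};

/* Hypothetical stand-in for the faulting process in the guest. */
enum task_state { TASK_RUNNING, TASK_WAITING };

struct guest_task {
	uint32_t token;
	enum task_state state;
};

/*
 * Host side: store a signal in the control block. In this sketch the
 * host refuses to overwrite a signal the guest has not consumed yet.
 */
static bool host_post_signal(struct apf_control_block *cb,
			     uint32_t reason, uint32_t token)
{
	if (cb->reason != APF_REASON_NONE)
		return false;

	cb->token = token;
	cb->reason = reason;
	return true;
}

/*
 * Guest side: consume the signal. PAGE_NOT_PRESENT parks the task,
 * PAGE_READY with a matching token makes it runnable again.
 */
static void guest_handle_signal(struct apf_control_block *cb,
				struct guest_task *task)
{
	uint32_t reason, token;

	assert(cb && task);

	reason = cb->reason;
	token = cb->token;
	cb->reason = APF_REASON_NONE;	/* mark the signal as consumed */

	switch (reason) {
	case APF_REASON_PAGE_NOT_PRESENT:
		task->token = token;
		task->state = TASK_WAITING;
		break;
	case APF_REASON_PAGE_READY:
		if (task->state == TASK_WAITING && task->token == token)
			task->state = TASK_RUNNING;
		break;
	default:
		break;
	}
}
```

A caller would post PAGE_NOT_PRESENT, let the guest consume it (the task
moves to TASK_WAITING), and later post PAGE_READY with the same token to
make the task runnable again. The sketch leaves out the power-saving
path and any memory ordering between host and guest.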