On Fri, Feb 16, 2024 at 12:00 PM Anish Moorthy <amoorthy@xxxxxxxxxx> wrote:
>
> On Thu, Feb 15, 2024 at 11:36 PM Gupta, Pankaj <pankaj.gupta@xxxxxxx> wrote:
> >
> > On 2/16/2024 12:53 AM, Anish Moorthy wrote:
> > > This series adds an option to cause stage-2 fault handlers to
> > > KVM_MEMORY_FAULT_EXIT when they would otherwise be required to fault in
> > > the userspace mappings. Doing so allows userspace to receive stage-2
> > > faults directly from KVM_RUN instead of through userfaultfd, which
> > > suffers from serious contention issues as the number of vCPUs scales.
> >
> > Thanks for your work!
>
> :D
>
> >
> > So, this is an alternative approach userspace like Qemu to do post copy
> > live migration using KVM_MEMORY_FAULT_EXIT instead of userfaultfd which
> > seems slower with more vCPU's.
> >
> > Maybe I am missing some things here, just curious how userspace VMM e.g
> > Qemu would do memory copy with this approach once the page is available
> > from remote host which was done with UFFDIO_COPY earlier?
>
> This new capability is meant to be used *alongside* userfaultfd during
> post-copy: it's not a replacement. KVM_RUN can generate page faults
> from outside the stage-2 fault handlers (IIUC instruction emulation is
> one source), and these paths are unchanged: so it's important that
> userspace still UFFDIO_REGISTERs KVM's mapping and reads from the UFFD
> to catch these guest accesses. But with the new
> KVM_MEM_EXIT_ON_MISSING memslot flag set, the stage-2 handlers will
> report needing to fault in memory via KVM_MEMORY_FAULT_EXIT instead of
> queuing onto the UFFD.
>
> In the workloads I've tested, the vast majority of guest-generated
> page faults (99%+) come from the stage-2 handlers. So this series
> "solves" the issue of contention on the UFFD file descriptor by
> (mostly) sidestepping it.
>
> As for how userspace actually uses the new functionality: when a vCPU
> thread receives a KVM_MEMORY_FAULT_EXIT for an unfetched page during
> post-copy it might
>
> (a) Fetch the page
> (b) Install the page into KVM's mapping via UFFDIO_COPY (don't
> necessarily need to UFFDIO_WAKE!)
> (c) Call KVM_RUN to re-enter the guest and retry the access. The
> stage-2 fault handler will fire again but almost certainly won't
> KVM_MEMORY_FAULT_EXIT now (since the UFFDIO_COPY will have mapped the
> page), so the guest can continue.
>
> and userspace can continue using some thread(s) to
>
> (a) Read page faults from the UFFD.
> (b) Install the page using UFFDIO_COPY + UFFDIO_WAKE
> (c) goto (a)
>
> to make sure it catches everything. The combination of these two things
> adds up to more performant "uffd-based" postcopy.
>
> I'm of course skimming over some details (e.g.: when two vCPU threads
> race to fetch a page one of them should probably MADV_POPULATE_WRITE
> somehow), but I hope this is helpful. My patch to the KVM demand
> paging self test might also clarify things a bit [1].

One other small detail is, you can equally use UFFDIO_CONTINUE,
depending on how the rest of the live migration implementation works.

Really briefly, this series should be viewed as an alternate (and more
scalable) mechanism to find out that a fault occurred. The way
userspace then *resolves* the fault (whether via UFFDIO_COPY or
UFFDIO_CONTINUE) can remain the same as before.

>
>
> Please let me know if you have more questions!
>
> [1] https://lore.kernel.org/kvm/1f67639d-c6a2-1f36-b086-eb65fa2ab275@xxxxxxx/T/#m28055e5d708103d126985e38e18b591d535e1e84
> >
> >
> > Just trying to understand how this will work for the existing interfaces.
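
To make the vCPU-thread steps (a)-(c) quoted above a little more
concrete, here is a minimal sketch. It is a pure userspace simulation
and not the selftest code: every sim_* name is hypothetical,
sim_install_page() merely stands in for UFFDIO_COPY (including its
EEXIST result when the page is already mapped), and sim_vcpu_read()
stands in for re-entering the guest with KVM_RUN. The only point is the
shape of the loop: fetch, install, tolerate a lost race, retry.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SIM_PAGE_SIZE 4096u
#define SIM_NUM_PAGES 8u

/* Hypothetical stand-in for guest memory that is being post-copied. */
struct sim_guest {
	uint8_t mem[SIM_NUM_PAGES][SIM_PAGE_SIZE];
	bool present[SIM_NUM_PAGES];	/* page already installed? */
	unsigned int fetches;		/* pages pulled from the source */
};

/* Step (a): stand-in for fetching one page from the migration source. */
void sim_fetch_page(struct sim_guest *g, uint64_t pfn, uint8_t *buf)
{
	memset(buf, (int)(pfn + 1), SIM_PAGE_SIZE);
	g->fetches++;
}

/*
 * Step (b): stand-in for UFFDIO_COPY. Fails with -EEXIST when some other
 * thread installed the page first.
 */
int sim_install_page(struct sim_guest *g, uint64_t pfn, const uint8_t *buf)
{
	if (g->present[pfn])
		return -EEXIST;
	memcpy(g->mem[pfn], buf, SIM_PAGE_SIZE);
	g->present[pfn] = true;
	return 0;
}

/* What a vCPU thread might do when told that the page at gpa is missing. */
int sim_handle_memory_fault_exit(struct sim_guest *g, uint64_t gpa)
{
	uint8_t buf[SIM_PAGE_SIZE];
	uint64_t pfn = gpa / SIM_PAGE_SIZE;
	int ret;

	if (pfn >= SIM_NUM_PAGES)
		return -EFAULT;

	sim_fetch_page(g, pfn, buf);
	ret = sim_install_page(g, pfn, buf);

	/* Losing the race to another thread is fine: the page is there now. */
	return ret == -EEXIST ? 0 : ret;
}

/* Step (c): stand-in for re-entering the guest and retrying the access. */
int sim_vcpu_read(struct sim_guest *g, uint64_t gpa, uint8_t *val)
{
	uint64_t pfn = gpa / SIM_PAGE_SIZE;

	if (pfn >= SIM_NUM_PAGES)
		return -EFAULT;

	while (!g->present[pfn]) {
		int ret = sim_handle_memory_fault_exit(g, gpa);

		if (ret)
			return ret;
	}

	*val = g->mem[pfn][gpa % SIM_PAGE_SIZE];
	return 0;
}
```

In a real VMM the reader thread(s) on the UFFD would run next to this,
which is where the lost-race case comes from.
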
> > Best regards,
> > Pankaj
> >
> > >
> > > Support for the new option (KVM_CAP_EXIT_ON_MISSING) is added to the
> > > demand_paging_test, which demonstrates the scalability improvements:
> > > the following data was collected using [2] on an x86 machine with 256
> > > cores.
> > >
> > > vCPUs, Average Paging Rate (w/o new caps), Average Paging Rate (w/ new caps)
> > > 1       150     340
> > > 2       191     477
> > > 4       210     809
> > > 8       155     1239
> > > 16      130     1595
> > > 32      108     2299
> > > 64      86      3482
> > > 128     62      4134
> > > 256     36      4012
> > >
> > > The diff since the last version is small enough that I've attached a
> > > range-diff in the cover letter- hopefully it's useful for review.
> > >
> > > Links
> > > ~~~~~
> > > [1] Original RFC from James Houghton:
> > > https://lore.kernel.org/linux-mm/CADrL8HVDB3u2EOhXHCrAgJNLwHkj2Lka1B_kkNb0dNwiWiAN_Q@xxxxxxxxxxxxxx/
> > >
> > > [2] ./demand_paging_test -b 64M -u MINOR -s shmem -a -v <n> -r <n> [-w]
> > >     A quick rundown of the new flags (also detailed in later commits)
> > >         -a registers all of guest memory to a single uffd.
> > >         -r specifies the number of reader threads for polling the uffd.
> > >         -w is what actually enables the new capabilities.
> > >     All data was collected after applying the entire series
> > >
> > > ---
> > >
> > > v7
> > > - Add comment for the upgrade-to-atomic in __gfn_to_pfn_memslot()
> > >   [James]
> > > - Expand description for KVM_MEM_GUEST_MEMFD in kvm/api.rst [James]
> > >   and split it off into its own commit [Anish]
> > > - Update documentation to indicate that KVM_CAP_MEMORY_FAULT_INFO is
> > >   available on arm [James]
> > > - Expand commit message for the "enable KVM_CAP_MEMORY_FAULT_INFO on
> > >   arm64" commit [Anish]
> > > - Drop buggy "fast GUP on read faults" patch [Thanks James!]
> > > - Make KVM_MEM_READONLY and KVM_MEM_EXIT_ON_MISSING mutually exclusive
> > >   [Sean, Oliver]
> > > - Drop incorrect "Documentation:" from some shortlogs [Sean]
> > > - Add description for the KVM_EXIT_MEMORY_FAULT RWX patch [Sean]
> > > - Style issues [Sean]
> > >
> > > v6: https://lore.kernel.org/kvm/20231109210325.3806151-1-amoorthy@xxxxxxxxxx/
> > > - Rebase onto guest_memfd series [Anish/Sean]
> > > - Set write fault flag properly in user_mem_abort() [Oliver]
> > > - Reformat unnecessarily multi-line comments [Sean]
> > > - Drop the kvm_vcpu_read|write_guest_page() annotations [Sean]
> > > - Rename *USERFAULT_ON_MISSING to *EXIT_ON_MISSING [David]
> > > - Remove unnecessary rounding in user_mem_abort() annotation [David]
> > > - Rewrite logs for KVM_MEM_EXIT_ON_MISSING patches and squash
> > >   them with the stage-2 fault annotation patches [Sean]
> > > - Undo the enum parameter addition to __gfn_to_pfn_memslot(), and just
> > >   add another boolean parameter instead [Sean]
> > > - Better shortlog for the hva_to_pfn_fast() change [Anish]
> > >
> > > v5: https://lore.kernel.org/kvm/20230908222905.1321305-1-amoorthy@xxxxxxxxxx/
> > > - Rename APIs (again) [Sean]
> > > - Initialize hardware_exit_reason along w/ exit_reason on x86 [Isaku]
> > > - Reword hva_to_pfn_fast() change commit message [Sean]
> > > - Correct style on terminal if statements [Sean]
> > > - Switch to kconfig to signal KVM_CAP_USERFAULT_ON_MISSING [Sean]
> > > - Add read fault flag for annotated faults [Sean]
> > > - read/write_guest_page() changes
> > >   - Move the annotations into vcpu wrapper fns [Sean]
> > >   - Reorder parameters [Robert]
> > > - Rename kvm_populate_efault_info() to
> > >   kvm_handle_guest_uaccess_fault() [Sean]
> > > - Remove unnecessary EINVAL on trying to enable memory fault info cap [Sean]
> > > - Correct description of the faults which hva_to_pfn_fast() can now
> > >   resolve [Sean]
> > > - Eliminate unnecessary parameter added to __kvm_faultin_pfn() [Sean]
> > > - Magnanimously accept Sean's rewrite of the handle_error_pfn()
> > >   annotation [Anish]
> > > - Remove vcpu null check from kvm_handle_guest_uaccess_fault [Sean]
> > >
> > > v4: https://lore.kernel.org/kvm/20230602161921.208564-1-amoorthy@xxxxxxxxxx/T/#t
> > > - Fix excessive indentation [Robert, Oliver]
> > > - Calculate final stats when uffd handler fn returns an error [Robert]
> > > - Remove redundant info from uffd_desc [Robert]
> > > - Fix various commit message typos [Robert]
> > > - Add comment about suppressed EEXISTs in selftest [Robert]
> > > - Add exit_reasons_known definition for KVM_EXIT_MEMORY_FAULT [Robert]
> > > - Fix some include/logic issues in self test [Robert]
> > > - Rename no-slow-gup cap to KVM_CAP_NOWAIT_ON_FAULT [Oliver, Sean]
> > > - Make KVM_CAP_MEMORY_FAULT_INFO informational-only [Oliver, Sean]
> > > - Drop most of the annotations from v3: see
> > >   https://lore.kernel.org/kvm/20230412213510.1220557-1-amoorthy@xxxxxxxxxx/T/#mfe28e6a5015b7cd8c5ea1c351b0ca194aeb33daf
> > > - Remove WARN on bare efaults [Sean, Oliver]
> > > - Eliminate unnecessary UFFDIO_WAKE call from self test [James]
> > >
> > > v3: https://lore.kernel.org/kvm/ZEBXi5tZZNxA+jRs@x1n/T/#t
> > > - Rework the implementation to be based on two orthogonal
> > >   capabilities (KVM_CAP_MEMORY_FAULT_INFO and
> > >   KVM_CAP_NOWAIT_ON_FAULT) [Sean, Oliver]
> > > - Change return code of kvm_populate_efault_info [Isaku]
> > > - Use kvm_populate_efault_info from arm code [Oliver]
> > >
> > > v2: https://lore.kernel.org/kvm/20230315021738.1151386-1-amoorthy@xxxxxxxxxx/
> > >
> > > This was a bit of a misfire, as I sent my WIP series on the mailing
> > > list but was just targeting Sean for some feedback. Oliver Upton and
> > > Isaku Yamahata ended up discovering the series and giving me some
> > > feedback anyways, so thanks to them :) In the end, there was enough
> > > discussion to justify retroactively labeling it as v2, even with the
> > > limited cc list.
> > >
> > > - Introduce KVM_CAP_X86_MEMORY_FAULT_EXIT.
> > > - API changes:
> > >   - Gate KVM_CAP_MEMORY_FAULT_NOWAIT behind
> > >     KVM_CAP_x86_MEMORY_FAULT_EXIT (on x86 only: arm has no such
> > >     requirement).
> > >   - Switched to memslot flag
> > > - Take Oliver's simplification to the "allow fast gup for readable
> > >   faults" logic.
> > > - Slightly redefine the return code of user_mem_abort.
> > > - Fix documentation errors brought up by Marc
> > > - Reword commit messages in imperative mood
> > >
> > > v1: https://lore.kernel.org/kvm/20230215011614.725983-1-amoorthy@xxxxxxxxxx/
> > >
> > > Anish Moorthy (14):
> > >   KVM: Clarify meaning of hva_to_pfn()'s 'atomic' parameter
> > >   KVM: Add function comments for __kvm_read/write_guest_page()
> > >   KVM: Documentation: Make note of the KVM_MEM_GUEST_MEMFD memslot flag
> > >   KVM: Simplify error handling in __gfn_to_pfn_memslot()
> > >   KVM: Define and communicate KVM_EXIT_MEMORY_FAULT RWX flags to
> > >     userspace
> > >   KVM: Add memslot flag to let userspace force an exit on missing hva
> > >     mappings
> > >   KVM: x86: Enable KVM_CAP_EXIT_ON_MISSING and annotate EFAULTs from
> > >     stage-2 fault handler
> > >   KVM: arm64: Enable KVM_CAP_MEMORY_FAULT_INFO and annotate fault in the
> > >     stage-2 fault handler
> > >   KVM: arm64: Implement and advertise KVM_CAP_EXIT_ON_MISSING
> > >   KVM: selftests: Report per-vcpu demand paging rate from demand paging
> > >     test
> > >   KVM: selftests: Allow many vCPUs and reader threads per UFFD in demand
> > >     paging test
> > >   KVM: selftests: Use EPOLL in userfaultfd_util reader threads and
> > >     signal errors via TEST_ASSERT
> > >   KVM: selftests: Add memslot_flags parameter to memstress_create_vm()
> > >   KVM: selftests: Handle memory fault exits in demand_paging_test
> > >
> > >  Documentation/virt/kvm/api.rst                |  39 ++-
> > >  arch/arm64/kvm/Kconfig                        |   1 +
> > >  arch/arm64/kvm/arm.c                          |   1 +
> > >  arch/arm64/kvm/mmu.c                          |   7 +-
> > >  arch/powerpc/kvm/book3s_64_mmu_hv.c           |   2 +-
> > >  arch/powerpc/kvm/book3s_64_mmu_radix.c        |   2 +-
> > >  arch/x86/kvm/Kconfig                          |   1 +
> > >  arch/x86/kvm/mmu/mmu.c                        |   8 +-
> > >  include/linux/kvm_host.h                      |  21 +-
> > >  include/uapi/linux/kvm.h                      |   5 +
> > >  .../selftests/kvm/aarch64/page_fault_test.c   |   4 +-
> > >  .../selftests/kvm/access_tracking_perf_test.c |   2 +-
> > >  .../selftests/kvm/demand_paging_test.c        | 295 ++++++++++++++----
> > >  .../selftests/kvm/dirty_log_perf_test.c       |   2 +-
> > >  .../testing/selftests/kvm/include/memstress.h |   2 +-
> > >  .../selftests/kvm/include/userfaultfd_util.h  |  17 +-
> > >  tools/testing/selftests/kvm/lib/memstress.c   |   4 +-
> > >  .../selftests/kvm/lib/userfaultfd_util.c      | 159 ++++++----
> > >  .../kvm/memslot_modification_stress_test.c    |   2 +-
> > >  .../x86_64/dirty_log_page_splitting_test.c    |   2 +-
> > >  virt/kvm/Kconfig                              |   3 +
> > >  virt/kvm/kvm_main.c                           |  46 ++-
> > >  22 files changed, 453 insertions(+), 172 deletions(-)
> > >
> > > Range-diff against v6:
> > > 1: 2089d8955538 ! 1: 063d5d109f34 KVM: Documentation: Clarify meaning of hva_to_pfn()'s 'atomic' parameter
> > >     @@ Metadata
> > >     Author: Anish Moorthy <amoorthy@xxxxxxxxxx>
> > >
> > >     ## Commit message ##
> > >     - KVM: Documentation: Clarify meaning of hva_to_pfn()'s 'atomic' parameter
> > >     + KVM: Clarify meaning of hva_to_pfn()'s 'atomic' parameter
> > >
> > >     - The current docstring can be read as "atomic -> allowed to sleep," when
> > >     - in fact the intended statement is "atomic -> NOT allowed to sleep." Make
> > >     - that clearer in the docstring.
> > >     + The current description can be read as "atomic -> allowed to sleep,"
> > >     + when in fact the intended statement is "atomic -> NOT allowed to sleep."
> > >     + Make that clearer in the docstring.
> > >
> > >     Signed-off-by: Anish Moorthy <amoorthy@xxxxxxxxxx>
> > >
> > > 2: 36963c6eee29 ! 2: e038fe64f44a KVM: Documentation: Add docstrings for __kvm_read/write_guest_page()
> > >     @@ Metadata
> > >     Author: Anish Moorthy <amoorthy@xxxxxxxxxx>
> > >
> > >     ## Commit message ##
> > >     - KVM: Documentation: Add docstrings for __kvm_read/write_guest_page()
> > >     + KVM: Add function comments for __kvm_read/write_guest_page()
> > >
> > >     The (gfn, data, offset, len) order of parameters is a little strange
> > >     - since "offset" applies to "gfn" rather than to "data". Add docstrings to
> > >     - make things perfectly clear.
> > >     + since "offset" applies to "gfn" rather than to "data". Add function
> > >     + comments to make things perfectly clear.
> > >
> > >     Signed-off-by: Anish Moorthy <amoorthy@xxxxxxxxxx>
> > >
> > > -: ------------ > 3: 812a2208da95 KVM: Documentation: Make note of the KVM_MEM_GUEST_MEMFD memslot flag
> > > 3: 4994835c51f5 = 4: 44cec9bf6166 KVM: Simplify error handling in __gfn_to_pfn_memslot()
> > > 4: 3d51224854b1 ! 5: df09c7482fbf KVM: Define and communicate KVM_EXIT_MEMORY_FAULT RWX flags to userspace
> > >     @@ Metadata
> > >     ## Commit message ##
> > >     KVM: Define and communicate KVM_EXIT_MEMORY_FAULT RWX flags to userspace
> > >
> > >     + kvm_prepare_memory_fault_exit() already takes parameters describing the
> > >     + RWX-ness of the relevant access but doesn't actually do anything with
> > >     + them. Define and use the flags necessary to pass this information on to
> > >     + userspace.
> > >     +
> > >     Suggested-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> > >     Signed-off-by: Anish Moorthy <amoorthy@xxxxxxxxxx>
> > >
> > > 5: 6bab46398020 < -: ------------ KVM: Try using fast GUP to resolve read faults
> > > 6: 556e7079c419 ! 6: 6a6993bda462 KVM: Add memslot flag to let userspace force an exit on missing hva mappings
> > >     @@ Commit message
> > >
> > >     Suggested-by: James Houghton <jthoughton@xxxxxxxxxx>
> > >     Suggested-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> > >     - Reviewed-by: James Houghton <jthoughton@xxxxxxxxxx>
> > >     Signed-off-by: Anish Moorthy <amoorthy@xxxxxxxxxx>
> > >
> > >     ## Documentation/virt/kvm/api.rst ##
> > >     @@ Documentation/virt/kvm/api.rst: yet and must be cleared on entry.
> > >     - /* for kvm_userspace_memory_region::flags */
> > >     #define KVM_MEM_LOG_DIRTY_PAGES (1UL << 0)
> > >     #define KVM_MEM_READONLY (1UL << 1)
> > >     -+ #define KVM_MEM_GUEST_MEMFD (1UL << 2)
> > >     #define KVM_MEM_GUEST_MEMFD (1UL << 2)
> > >     + #define KVM_MEM_EXIT_ON_MISSING (1UL << 3)
> > >
> > >     This ioctl allows the user to create, modify or delete a guest physical
> > >     @@ Documentation/virt/kvm/api.rst: It is recommended that the lower 21 bits of gues
> > >     be identical. This allows large pages in the guest to be backed by large
> > >     pages in the host.
> > >
> > >     --The flags field supports two flags: KVM_MEM_LOG_DIRTY_PAGES and
> > >     --KVM_MEM_READONLY. The former can be set to instruct KVM to keep track of
> > >     +-The flags field supports three flags
> > >     +The flags field supports four flags
> > >     -+
> > >     -+1. KVM_MEM_LOG_DIRTY_PAGES: can be set to instruct KVM to keep track of
> > >     +
> > >     + 1. KVM_MEM_LOG_DIRTY_PAGES: can be set to instruct KVM to keep track of
> > >     writes to memory within the slot. See KVM_GET_DIRTY_LOG ioctl to know how to
> > >     --use it. The latter can be set, if KVM_CAP_READONLY_MEM capability allows it,
> > >     -+use it.
> > >     -+2. KVM_MEM_READONLY: can be set, if KVM_CAP_READONLY_MEM capability allows it,
> > >     - to make a new slot read-only. In this case, writes to this memory will be
> > >     +@@ Documentation/virt/kvm/api.rst: to make a new slot read-only. In this case, writes to this memory will be
> > >     posted to userspace as KVM_EXIT_MMIO exits.
> > >     -+3. KVM_MEM_GUEST_MEMFD
> > >     + 3. KVM_MEM_GUEST_MEMFD: see KVM_SET_USER_MEMORY_REGION2. This flag is
> > >     + incompatible with KVM_SET_USER_MEMORY_REGION.
> > >     +4. KVM_MEM_EXIT_ON_MISSING: see KVM_CAP_EXIT_ON_MISSING for details.
> > >
> > >     When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
> > >     the memory region are automatically reflected into the guest. For example, an
> > >     +@@ Documentation/virt/kvm/api.rst: Instead, an abort (data abort if the cause of the page-table update
> > >     + was a load or a store, instruction abort if it was an instruction
> > >     + fetch) is injected in the guest.
> > >     +
> > >     ++Note: KVM_MEM_READONLY and KVM_MEM_EXIT_ON_MISSING are currently mutually
> > >     ++exclusive.
> > >     ++
> > >     + 4.36 KVM_SET_TSS_ADDR
> > >     + ---------------------
> > >     +
> > >     @@ Documentation/virt/kvm/api.rst: error/annotated fault.
> > >
> > >     See KVM_EXIT_MEMORY_FAULT for more information.
> > >     @@ include/uapi/linux/kvm.h: struct kvm_userspace_memory_region2 {
> > >
> > >     /* for KVM_IRQ_LINE */
> > >     struct kvm_irq_level {
> > >     -@@ include/uapi/linux/kvm.h: struct kvm_ppc_resize_hpt {
> > >     +@@ include/uapi/linux/kvm.h: struct kvm_enable_cap {
> > >     #define KVM_CAP_MEMORY_ATTRIBUTES 233
> > >     #define KVM_CAP_GUEST_MEMFD 234
> > >     #define KVM_CAP_VM_TYPES 235
> > >     +#define KVM_CAP_EXIT_ON_MISSING 236
> > >
> > >     - #ifdef KVM_CAP_IRQ_ROUTING
> > >     -
> > >     + struct kvm_irq_routing_irqchip {
> > >     + __u32 irqchip;
> > >
> > >     ## virt/kvm/Kconfig ##
> > >     @@ virt/kvm/Kconfig: config KVM_GENERIC_PRIVATE_MEM
> > >     @@ virt/kvm/kvm_main.c: static int check_memory_region_flags(struct kvm *kvm,
> > >     +
> > >     if (mem->flags & ~valid_flags)
> > >     return -EINVAL;
> > >     ++ else if ((mem->flags & KVM_MEM_READONLY) &&
> > >     ++ (mem->flags & KVM_MEM_EXIT_ON_MISSING))
> > >     ++ return -EINVAL;
> > >
> > >     + return 0;
> > >     + }
> > >     @@ virt/kvm/kvm_main.c: kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool interruptible,
> > >
> > >     kvm_pfn_t __gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn,
> > >     @@ virt/kvm/kvm_main.c: kvm_pfn_t __gfn_to_pfn_memslot(const struct kvm_memory_slot
> > >     writable = NULL;
> > >     }
> > >
> > >     -+ if (!atomic && can_exit_on_missing
> > >     -+ && kvm_is_slot_exit_on_missing(slot)) {
> > >     ++ /* When the slot is exit-on-missing (and when we should respect that)
> > >     ++ * set atomic=true to prevent GUP from faulting in the userspace
> > >     ++ * mappings.
> > >     ++ */
> > >     ++ if (!atomic && can_exit_on_missing &&
> > >     ++ kvm_is_slot_exit_on_missing(slot)) {
> > >     + atomic = true;
> > >     + if (async) {
> > >     + *async = false;
> > > 7: 28b6fe1ad5b9 ! 7: 70696937be14 KVM: x86: Enable KVM_CAP_EXIT_ON_MISSING and annotate EFAULTs from stage-2 fault handler
> > >     @@ Documentation/virt/kvm/api.rst: See KVM_EXIT_MEMORY_FAULT for more information.
> > >
> > >     ## arch/x86/kvm/Kconfig ##
> > >     @@ arch/x86/kvm/Kconfig: config KVM
> > >     - select INTERVAL_TREE
> > >     + select KVM_VFIO
> > >     select HAVE_KVM_PM_NOTIFIER if PM
> > >     select KVM_GENERIC_HARDWARE_ENABLING
> > >     + select HAVE_KVM_EXIT_ON_MISSING
> > > 8: a80db5672168 < -: ------------ KVM: arm64: Enable KVM_CAP_MEMORY_FAULT_INFO
> > > -: ------------ > 8: 05bbf29372ed KVM: arm64: Enable KVM_CAP_MEMORY_FAULT_INFO and annotate fault in the stage-2 fault handler
> > > 9: 70c5db4f5c9e ! 9: bb22b31c8437 KVM: arm64: Enable KVM_CAP_EXIT_ON_MISSING and annotate an EFAULT from stage-2 fault-handler
> > >     @@ Metadata
> > >     Author: Anish Moorthy <amoorthy@xxxxxxxxxx>
> > >
> > >     ## Commit message ##
> > >     - KVM: arm64: Enable KVM_CAP_EXIT_ON_MISSING and annotate an EFAULT from stage-2 fault-handler
> > >     + KVM: arm64: Implement and advertise KVM_CAP_EXIT_ON_MISSING
> > >
> > >     Prevent the stage-2 fault handler from faulting in pages when
> > >     KVM_MEM_EXIT_ON_MISSING is set by allowing its __gfn_to_pfn_memslot()
> > >     - calls to check the memslot flag.
> > >     -
> > >     - To actually make that behavior useful, prepare a KVM_EXIT_MEMORY_FAULT
> > >     - when the stage-2 handler cannot resolve the pfn for a fault. With
> > >     - KVM_MEM_EXIT_ON_MISSING enabled this effects the delivery of stage-2
> > >     - faults as vCPU exits, which userspace can attempt to resolve without
> > >     - terminating the guest.
> > >     + call to check the memslot flag. This effects the delivery of stage-2
> > >     + faults as vCPU exits (see KVM_CAP_MEMORY_FAULT_INFO), which userspace
> > >     + can attempt to resolve without terminating the guest.
> > >
> > >     Delivering stage-2 faults to userspace in this way sidesteps the
> > >     significant scalabiliy issues associated with using userfaultfd for the
> > >     @@ Documentation/virt/kvm/api.rst: See KVM_EXIT_MEMORY_FAULT for more information.
> > >
> > >     ## arch/arm64/kvm/Kconfig ##
> > >     @@ arch/arm64/kvm/Kconfig: menuconfig KVM
> > >     + select SCHED_INFO
> > >     select GUEST_PERF_EVENTS if PERF_EVENTS
> > >     - select INTERVAL_TREE
> > >     select XARRAY_MULTI
> > >     + select HAVE_KVM_EXIT_ON_MISSING
> > >     help
> > >     @@ arch/arm64/kvm/mmu.c: static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr
> > >     if (pfn == KVM_PFN_ERR_HWPOISON) {
> > >     kvm_send_hwpoison_signal(hva, vma_shift);
> > >     return 0;
> > >     - }
> > >     -- if (is_error_noslot_pfn(pfn))
> > >     -+ if (is_error_noslot_pfn(pfn)) {
> > >     -+ kvm_prepare_memory_fault_exit(vcpu, gfn * PAGE_SIZE, PAGE_SIZE,
> > >     -+ write_fault, exec_fault, false);
> > >     - return -EFAULT;
> > >     -+ }
> > >     -
> > >     - if (kvm_is_device_pfn(pfn)) {
> > >     - /*
> > > 10: ab913b9b5570 = 10: a62ee8593b84 KVM: selftests: Report per-vcpu demand paging rate from demand paging test
> > > 11: a27ff8b097d7 ! 11: 58ddb652eac1 KVM: selftests: Allow many vCPUs and reader threads per UFFD in demand paging test
> > >     @@ Commit message
> > >     configuring the number of reader threads per UFFD as well: add the "-r"
> > >     flag to do so.
> > >
> > >     - Acked-by: James Houghton <jthoughton@xxxxxxxxxx>
> > >     Signed-off-by: Anish Moorthy <amoorthy@xxxxxxxxxx>
> > >     + Acked-by: James Houghton <jthoughton@xxxxxxxxxx>
> > >
> > >     ## tools/testing/selftests/kvm/aarch64/page_fault_test.c ##
> > >     @@ tools/testing/selftests/kvm/aarch64/page_fault_test.c: static void setup_uffd(struct kvm_vm *vm, struct test_params *p,
> > > 12: ee196df32964 ! 12: b4cfe82097e2 KVM: selftests: Use EPOLL in userfaultfd_util reader threads and signal errors via TEST_ASSERT
> > >     @@ Commit message
> > >     [1] Single-vCPU performance does suffer somewhat.
> > >     [2] ./demand_paging_test -u MINOR -s shmem -v 4 -o -r <num readers>
> > >
> > >     - Acked-by: James Houghton <jthoughton@xxxxxxxxxx>
> > >     Signed-off-by: Anish Moorthy <amoorthy@xxxxxxxxxx>
> > >     + Acked-by: James Houghton <jthoughton@xxxxxxxxxx>
> > >
> > >     ## tools/testing/selftests/kvm/demand_paging_test.c ##
> > >     @@
> > > 13: 9406cb2581e5 = 13: f8095728fcef KVM: selftests: Add memslot_flags parameter to memstress_create_vm()
> > > 14: dbab5917e1f6 ! 14: a5863f1206bb KVM: selftests: Handle memory fault exits in demand_paging_test
> > >     @@ Commit message
> > >
> > >     Demonstrate a (very basic) scheme for supporting memory fault exits.
> > >
> > >     - >From the vCPU threads:
> > >     + From the vCPU threads:
> > >     1. Simply issue UFFDIO_COPY/CONTINUEs in response to memory fault exits,
> > >     with the purpose of establishing the absent mappings. Do so with
> > >     wake_waiters=false to avoid serializing on the userfaultfd wait queue
> > >     @@ Commit message
> > >     [A] In reality it is much likelier that the vCPU thread simply lost a
> > >     race to establish the mapping for the page.
> > >
> > >     - Acked-by: James Houghton <jthoughton@xxxxxxxxxx>
> > >     Signed-off-by: Anish Moorthy <amoorthy@xxxxxxxxxx>
> > >     + Acked-by: James Houghton <jthoughton@xxxxxxxxxx>
> > >
> > >     ## tools/testing/selftests/kvm/demand_paging_test.c ##
> > >     @@
> > >
> > > base-commit: 687d8f4c3dea0758afd748968d91288220bbe7e3
> >
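
For anyone wiring this up from the VMM side: the v7 changelog quoted
above makes KVM_MEM_READONLY and KVM_MEM_EXIT_ON_MISSING mutually
exclusive, so a VMM may want to reject that combination before it ever
reaches the ioctl. A minimal sketch of such a pre-check follows. The
SKETCH_* names and the helper are hypothetical (they only mirror the
quoted check_memory_region_flags() hunk), the bit positions are copied
from the quoted api.rst hunk, and the "supported" mask is an assumption
about what the caller already learned from the capability checks.

```c
#include <errno.h>

/* Bit positions copied from the api.rst hunk quoted above. */
#define SKETCH_MEM_LOG_DIRTY_PAGES	(1UL << 0)
#define SKETCH_MEM_READONLY		(1UL << 1)
#define SKETCH_MEM_GUEST_MEMFD		(1UL << 2)
#define SKETCH_MEM_EXIT_ON_MISSING	(1UL << 3)

/*
 * Hypothetical userspace pre-check: returns 0 if the memslot flags look
 * acceptable, -EINVAL otherwise. "supported" is the set of flags the
 * caller believes the running kernel accepts.
 */
int sketch_check_memslot_flags(unsigned long flags, unsigned long supported)
{
	/* Reject any flag outside the supported set. */
	if (flags & ~supported)
		return -EINVAL;

	/* Read-only and exit-on-missing cannot be combined. */
	if ((flags & SKETCH_MEM_READONLY) &&
	    (flags & SKETCH_MEM_EXIT_ON_MISSING))
		return -EINVAL;

	return 0;
}
```

The kernel remains the authority here; a check like this just turns a
late ioctl failure into an early, easier-to-debug one.
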