Hi Sean, here's what I'm planning to send up as v2 of the scalable
userfaultfd series.

Don't worry, I'm not asking you to review this all :) I just have a few
remaining questions regarding KVM_CAP_MEMORY_FAULT_EXIT which seem
important enough to mention before I ask for more attention from others,
and they'll be clearer with the patches in hand. Anything else I'm happy
to find out about when I send the actual v2.

I want your opinion on

1. The general API I've set up for KVM_CAP_MEMORY_FAULT_EXIT
   (described in the api.rst file)
2. Whether the UNKNOWN exit reason cases (everywhere but
   handle_error_pfn atm) would need to be given "real" reasons
   before this could be merged.
3. Whether you think I've missed sites that currently -EFAULT to
   userspace

About (3): after we agreed to only tackle cases where -EFAULT currently
makes it to userspace, I went through our list and tried to trace which
EFAULTs actually bubble up to KVM_RUN. That set ended up being
suspiciously small, so I wanted to sanity-check my findings with you.
Lmk if you see obvious errors in my list below.

--- EFAULTs under KVM_RUN ---

Confident that needs conversion (already converted)
---------------------------------------------------
* direct_map
* handle_error_pfn
* setup_vmgexit_scratch
* kvm_handle_page_fault
* FNAME(fetch)

EFAULT does not propagate to userspace (do not convert)
-------------------------------------------------------
* record_steal_time (arch/x86/kvm/x86.c:3463)
* hva_to_pfn_retry
* kvm_vcpu_map
* FNAME(update_accessed_dirty_bits)
* __kvm_gfn_to_hva_cache_init
  Might actually make it to userspace, but only through
  kvm_read|write_guest_offset_cached, so it would be covered by those
  conversions
* kvm_gfn_to_hva_cache_init
* __kvm_read_guest_page
* hva_to_pfn_remapped
  handle_error_pfn will handle this for the scalable uffd case. Don't
  think other callers -EFAULT to userspace.

Still unsure if needs conversion
--------------------------------
* __kvm_read_guest_atomic
  The EFAULT might be propagated through FNAME(sync_page)?
* kvm_write_guest_offset_cached (virt/kvm/kvm_main.c:3226)
* __kvm_write_guest_page
  Called from kvm_write_guest_offset_cached: if that needs change, this
  does too
* kvm_write_guest_page
  Two interesting paths:
      - kvm_pv_clock_pairing returns a custom KVM_EFAULT error here
        (arch/x86/kvm/x86.c:9578)
      - kvm_write_guest_offset_cached returns this directly (so if that
        needs change, this does too)
* kvm_read_guest_offset_cached
  I actually do see a path to userspace, but it's through Hyper-V, which
  we've said is out of scope for round 1.

--- Actual Cover Letter ---
Omitted: hasn't changed much since v1 anyways

--- Changelog ---

WIP v2
  - Introduce KVM_CAP_X86_MEMORY_FAULT_EXIT.
  - API changes:
        - Gate KVM_CAP_MEMORY_FAULT_NOWAIT behind
          KVM_CAP_X86_MEMORY_FAULT_EXIT (on x86 only: arm has no such
          requirement).
        - Switched to memslot flag
  - Take Oliver's simplification to the "allow fast gup for readable
    faults" logic.
  - Slightly redefine the return code of user_mem_abort.
  - Fix documentation errors brought up by Marc
  - Reword commit messages in imperative mood

v1: https://lore.kernel.org/kvm/20230215011614.725983-1-amoorthy@xxxxxxxxxx/

Anish Moorthy (14):
  KVM: selftests: Allow many vCPUs and reader threads per UFFD in demand
    paging test
  KVM: selftests: Use EPOLL in userfaultfd_util reader threads and signal
    errors via TEST_ASSERT
  KVM: Allow hva_pfn_fast to resolve read-only faults.
  KVM: x86: Add KVM_CAP_X86_MEMORY_FAULT_EXIT and associated kvm_run field
  KVM: x86: Implement memory fault exit for direct_map
  KVM: x86: Implement memory fault exit for kvm_handle_page_fault
  KVM: x86: Implement memory fault exit for setup_vmgexit_scratch
  KVM: x86: Implement memory fault exit for FNAME(fetch)
  KVM: Introduce KVM_CAP_MEMORY_FAULT_NOWAIT without implementation
  KVM: x86: Implement KVM_CAP_MEMORY_FAULT_NOWAIT
  KVM: arm64: Allow user_mem_abort to return 0 to signal a 'normal' exit
  KVM: arm64: Implement KVM_CAP_MEMORY_FAULT_NOWAIT
  KVM: selftests: Add memslot_flags parameter to memstress_create_vm
  KVM: selftests: Handle memory fault exits in demand_paging_test

 Documentation/virt/kvm/api.rst                |  74 ++++-
 arch/arm64/kvm/arm.c                          |   1 +
 arch/arm64/kvm/mmu.c                          |  29 +-
 arch/x86/kvm/mmu/mmu.c                        |  42 ++-
 arch/x86/kvm/mmu/paging_tmpl.h                |   4 +-
 arch/x86/kvm/svm/sev.c                        |   4 +-
 arch/x86/kvm/x86.c                            |   2 +
 include/linux/kvm_host.h                      |  22 ++
 include/uapi/linux/kvm.h                      |  19 ++
 tools/include/uapi/linux/kvm.h                |  17 ++
 .../selftests/kvm/aarch64/page_fault_test.c   |   4 +-
 .../selftests/kvm/access_tracking_perf_test.c |   2 +-
 .../selftests/kvm/demand_paging_test.c        | 253 ++++++++++++++----
 .../selftests/kvm/dirty_log_perf_test.c       |   2 +-
 .../testing/selftests/kvm/include/memstress.h |   2 +-
 .../selftests/kvm/include/userfaultfd_util.h  |  18 +-
 tools/testing/selftests/kvm/lib/memstress.c   |   4 +-
 .../selftests/kvm/lib/userfaultfd_util.c      | 160 ++++++-----
 .../kvm/memslot_modification_stress_test.c    |   2 +-
 virt/kvm/kvm_main.c                           |  41 ++-
 20 files changed, 544 insertions(+), 158 deletions(-)

-- 
2.40.0.rc1.284.g88254d51c5-goog