On 24/10/2022 09:43, Emanuele Giuseppe Esposito wrote:
>
>
> On 23/10/2022 19:48, Paolo Bonzini wrote:
>> On 10/22/22 17:48, Emanuele Giuseppe Esposito wrote:
>>> Once a vcpu executes KVM_RUN, it can enter one of two states:
>>> guest mode, or blocked/halted.
>>> Use a signal to allow a vcpu to exit guest mode or unblock, so
>>> that it can exit KVM_RUN and release the read semaphore, allowing
>>> a pending KVM_KICK_ALL_RUNNING_VCPUS to continue.
>>>
>>> Note that the signal is not deleted: it is used to propagate the
>>> exit reason up to vcpu_run(), and is cleared only by
>>> KVM_RESUME_ALL_KICKED_VCPUS. This allows the vcpu to keep trying
>>> to enter KVM_RUN and perform again all the checks done in
>>> kvm_arch_vcpu_ioctl_run() before entering guest mode, where it
>>> will return again if the request is still set.
>>>
>>> However, the userspace hypervisor should also try to avoid
>>> continuously calling KVM_RUN after invoking
>>> KVM_KICK_ALL_RUNNING_VCPUS, because each such call just translates
>>> into a back-to-back down_read() and up_read() (thanks to the
>>> signal).
>>
>> Since userspace should anyway avoid going into this effectively
>> busy wait, what about clearing the request after the first exit?
>> The cancellation ioctl can be kept for vCPUs that are never entered
>> after KVM_KICK_ALL_RUNNING_VCPUS. Alternatively,
>> kvm_clear_all_cpus_request could be done right before up_write().
>
> Clearing makes sense, but should we "trust" the userspace not to go
> into a busy wait?

Actually, since this change is just a s/test/check, I would rather keep
test. If userspace does things wrong, this mechanism would still work
properly.

> What's the typical "contract" between KVM and the userspace? Meaning,
> should we cover basic usage mistakes like busy-waiting on KVM_RUN?
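FWIW, to make that contract concrete, the flow I would expect (and
document) on the userspace side is roughly the following. This is an
untested sketch: pause_vcpu_threads()/resume_vcpu_threads() stand in
for whatever mechanism the VMM uses to park its vCPU threads, and the
two ioctls are the ones added by this series, not mainline definitions.

#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Assumed VMM helpers, not part of this series. */
void pause_vcpu_threads(void);
void resume_vcpu_threads(void);

static void memory_transaction(int vm_fd)
{
	/* Kick every vCPU out of KVM_RUN; returns once all have exited. */
	ioctl(vm_fd, KVM_KICK_ALL_RUNNING_VCPUS, 0);
	pause_vcpu_threads();

	/* ... update memslots, e.g. via KVM_SET_USER_MEMORY_REGION ... */

	/* Clear KVM_REQ_USERSPACE_KICK so KVM_RUN enters the guest again. */
	ioctl(vm_fd, KVM_RESUME_ALL_KICKED_VCPUS, 0);
	resume_vcpu_threads();
}

The pause/resume pairing is what avoids the back-to-back
down_read()/up_read() busy loop mentioned in the commit message.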
>
> If we don't, I can add a comment when clearing and of course also
> mention it in the API documentation (that I forgot to update, sorry
> :D)
>
> Emanuele
>
>>
>> Paolo
>>
>>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@xxxxxxxxxx>
>>> ---
>>>  arch/x86/include/asm/kvm_host.h |  2 ++
>>>  arch/x86/kvm/x86.c              |  8 ++++++++
>>>  virt/kvm/kvm_main.c             | 21 +++++++++++++++++++++
>>>  3 files changed, 31 insertions(+)
>>>
>>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>>> index aa381ab69a19..d5c37f344d65 100644
>>> --- a/arch/x86/include/asm/kvm_host.h
>>> +++ b/arch/x86/include/asm/kvm_host.h
>>> @@ -108,6 +108,8 @@
>>>  	KVM_ARCH_REQ_FLAGS(30, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
>>>  #define KVM_REQ_MMU_FREE_OBSOLETE_ROOTS \
>>>  	KVM_ARCH_REQ_FLAGS(31, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
>>> +#define KVM_REQ_USERSPACE_KICK \
>>> +	KVM_ARCH_REQ_FLAGS(32, KVM_REQUEST_WAIT)
>>>  #define CR0_RESERVED_BITS \
>>>  	(~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index b0c47b41c264..2af5f427b4e9 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -10270,6 +10270,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>>  	}
>>>
>>>  	if (kvm_request_pending(vcpu)) {
>>> +		if (kvm_test_request(KVM_REQ_USERSPACE_KICK, vcpu)) {
>>> +			r = -EINTR;
>>> +			goto out;
>>> +		}
>>>  		if (kvm_check_request(KVM_REQ_VM_DEAD, vcpu)) {
>>>  			r = -EIO;
>>>  			goto out;
>>> @@ -10701,6 +10705,10 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
>>>  			r = vcpu_block(vcpu);
>>>  		}
>>>
>>> +		/* vcpu exited guest/unblocked because of this request */
>>> +		if (kvm_test_request(KVM_REQ_USERSPACE_KICK, vcpu))
>>> +			return -EINTR;
>>> +
>>>  		if (r <= 0)
>>>  			break;
>>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>>> index ae0240928a4a..13fa7229b85d 100644
>>> --- a/virt/kvm/kvm_main.c
>>> +++ b/virt/kvm/kvm_main.c
>>> @@ -3431,6 +3431,8 @@ static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
>>>  		goto out;
>>>  	if (kvm_check_request(KVM_REQ_UNBLOCK, vcpu))
>>>  		goto out;
>>> +	if (kvm_test_request(KVM_REQ_USERSPACE_KICK, vcpu))
>>> +		goto out;
>>>
>>>  	ret = 0;
>>>  out:
>>> @@ -4668,6 +4670,25 @@ static long kvm_vm_ioctl(struct file *filp,
>>>  		r = kvm_vm_ioctl_enable_cap_generic(kvm, &cap);
>>>  		break;
>>>  	}
>>> +	case KVM_KICK_ALL_RUNNING_VCPUS: {
>>> +		/*
>>> +		 * Notify all running vcpus that they have to stop.
>>> +		 * Caught in kvm_arch_vcpu_ioctl_run()
>>> +		 */
>>> +		kvm_make_all_cpus_request(kvm, KVM_REQ_USERSPACE_KICK);
>>> +
>>> +		/*
>>> +		 * Use wr semaphore to wait for all vcpus to exit from
>>> +		 * KVM_RUN.
>>> +		 */
>>> +		down_write(&memory_transaction);
>>> +		up_write(&memory_transaction);
>>> +		break;
>>> +	}
>>> +	case KVM_RESUME_ALL_KICKED_VCPUS: {
>>> +		/* Remove all requests sent with KVM_KICK_ALL_RUNNING_VCPUS */
>>> +		kvm_clear_all_cpus_request(kvm, KVM_REQ_USERSPACE_KICK);
>>> +		break;
>>> +	}
>>>  	case KVM_SET_USER_MEMORY_REGION: {
>>>  		struct kvm_userspace_memory_region kvm_userspace_mem;
>>
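On the vCPU thread side, the corresponding loop would then be along
these lines (again an untested sketch; wait_until_resumed() is an
assumed VMM helper, e.g. a condition variable signalled after
KVM_RESUME_ALL_KICKED_VCPUS returns):

#include <errno.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Assumed VMM synchronization helper, not part of this series. */
void wait_until_resumed(void);

static void *vcpu_thread(void *opaque)
{
	int vcpu_fd = *(int *)opaque;

	for (;;) {
		if (ioctl(vcpu_fd, KVM_RUN, 0) < 0 && errno == EINTR) {
			/*
			 * Kicked by KVM_KICK_ALL_RUNNING_VCPUS: retrying
			 * KVM_RUN before the resume ioctl would only do a
			 * back-to-back down_read()/up_read() and exit again,
			 * so park the thread instead of spinning.
			 */
			wait_until_resumed();
			continue;
		}

		/* ... handle the usual struct kvm_run exit reasons ... */
	}

	return NULL;
}

A real VMM would of course also have to distinguish this EINTR from one
caused by an ordinary pending signal.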