Mark a vCPU as preempted/ready if-and-only-if it's scheduled out while running. i.e. Do not mark a vCPU preempted/ready if it's scheduled out during a non-KVM_RUN ioctl() or when userspace is doing KVM_RUN with immediate_exit. Commit 54aa83c90198 ("KVM: x86: do not set st->preempted when going back to user space") stopped marking a vCPU as preempted when returning to userspace, but if userspace then invokes a KVM vCPU ioctl() that gets preempted, the vCPU will be marked preempted/ready. This is arguably incorrect behavior since the vCPU was not actually preempted while the guest was running, it was preempted while doing something on behalf of userspace. This commit also avoids KVM dirtying guest memory after userspace has paused vCPUs, e.g. for Live Migration, which allows userspace to collect the final dirty bitmap before or in parallel with saving vCPU state without having to worry about saving vCPU state triggering writes to guest memory. Suggested-by: Sean Christopherson <seanjc@xxxxxxxxxx> Signed-off-by: David Matlack <dmatlack@xxxxxxxxxx> --- v2: - Drop Google-specific "PRODKERNEL: " shortlog prefix v1: https://lore.kernel.org/kvm/20231218185850.1659570-1-dmatlack@xxxxxxxxxx/ include/linux/kvm_host.h | 1 + virt/kvm/kvm_main.c | 5 ++++- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 7e7fd25b09b3..5b2300614d22 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -378,6 +378,7 @@ struct kvm_vcpu { bool dy_eligible; } spin_loop; #endif + bool wants_to_run; bool preempted; bool ready; struct kvm_vcpu_arch arch; diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index ff588677beb7..3da1b2e3785d 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -4438,7 +4438,10 @@ static long kvm_vcpu_ioctl(struct file *filp, synchronize_rcu(); put_pid(oldpid); } + vcpu->wants_to_run = !vcpu->run->immediate_exit; r = kvm_arch_vcpu_ioctl_run(vcpu); + vcpu->wants_to_run = false; + trace_kvm_userspace_exit(vcpu->run->exit_reason, r); break; } @@ -6312,7 +6315,7 @@ static void kvm_sched_out(struct preempt_notifier *pn, { struct kvm_vcpu *vcpu = preempt_notifier_to_vcpu(pn); - if (current->on_rq) { + if (current->on_rq && vcpu->wants_to_run) { WRITE_ONCE(vcpu->preempted, true); WRITE_ONCE(vcpu->ready, true); } base-commit: 687d8f4c3dea0758afd748968d91288220bbe7e3 -- 2.44.0.278.ge034bb2e1d-goog