2017-02-08 12:10+0100, Paolo Bonzini:
> The purpose of the KVM_SET_SIGNAL_MASK API is to let userspace "kick"
> a VCPU out of KVM_RUN through a POSIX signal.  A signal is attached
> to a dummy signal handler; by blocking the signal outside KVM_RUN and
> unblocking it inside, this possible race is closed:
>
>           VCPU thread                     service thread
>    --------------------------------------------------------------
>         check flag
>                                           set flag
>                                           raise signal
>         (signal handler does nothing)
>         KVM_RUN
>
> However, one issue with KVM_SET_SIGNAL_MASK is that it has to take
> tsk->sighand->siglock on every KVM_RUN.  This lock is often on a
> remote NUMA node, because it is on the node of a thread's creator.
> Taking this lock can be very expensive if there are many userspace
> exits (as is the case for SMP Windows VMs without Hyper-V reference
> time counter).
>
> As an alternative, we can put the flag directly in kvm_run so that
> KVM can see it:
>
>           VCPU thread                     service thread
>    --------------------------------------------------------------
>                                           raise signal
>         signal handler
>           set run->immediate_exit
>         KVM_RUN
>           check run->immediate_exit
>
> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> ---
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> @@ -2564,9 +2565,15 @@ static long kvm_vcpu_ioctl(struct file *filp,
>  			synchronize_rcu();
>  			put_pid(oldpid);
>  		}
> -		r = kvm_arch_vcpu_ioctl_run(vcpu, vcpu->run);
> -		trace_kvm_userspace_exit(vcpu->run->exit_reason, r);
> +		run = vcpu->run;
> +		if (run->immediate_exit) {
> +			WRITE_ONCE(run->immediate_exit, 0);
> +			return -EINTR;
> +		}

QEMU also uses self-kick to complete IO, but run->immediate_exit is
checked too soon for that.  I think we should move it at least into
kvm_arch_vcpu_ioctl_run(), to cover two uses of the interrupt mask.
(I don't remember the reason behind QEMU's mask on SIGBUS any more.)

Thanks.

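For reference, the userspace side of the second diagram could look
roughly like the sketch below.  This is only an illustration, not QEMU
code: struct fake_kvm_run, kick_target, kick_handler(),
install_kick_handler() and fake_kvm_run_ioctl() are hypothetical names,
and fake_kvm_run_ioctl() merely models the check at the position used in
the quoted hunk (it does not call into KVM at all).

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared kvm_run page; only the flag
 * discussed in the patch is modelled here. */
struct fake_kvm_run {
	volatile sig_atomic_t immediate_exit;
};

/* Assumption: one VCPU per process, so a single global pointer suffices. */
static struct fake_kvm_run *kick_target;

/* The handler is no longer a dummy: it sets the flag that KVM would see. */
static void kick_handler(int sig)
{
	(void)sig;
	if (kick_target)
		kick_target->immediate_exit = 1;
}

/* Register the kick handler for signo; returns 0 on success, -1 on error. */
static int install_kick_handler(struct fake_kvm_run *run, int signo)
{
	kick_target = run;
	return signal(signo, kick_handler) == SIG_ERR ? -1 : 0;
}

/* Models the check from the quoted hunk: consume the flag and bail out
 * with -EINTR before the guest would be entered. */
static int fake_kvm_run_ioctl(struct fake_kvm_run *run)
{
	if (run->immediate_exit) {
		run->immediate_exit = 0;
		return -EINTR;
	}
	/* The guest would run here. */
	return 0;
}
```

With this model, a signal that arrives before the ioctl makes the next
fake_kvm_run_ioctl() return -EINTR once, without the signal being
blocked or unblocked around the call.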
> +		r = kvm_arch_vcpu_ioctl_run(vcpu, run);
> +		trace_kvm_userspace_exit(run->exit_reason, r);
>  		break;
> +	}
>  	case KVM_GET_REGS: {
>  		struct kvm_regs *kvm_regs;
>