On Thu, Sep 22, 2011 at 01:52:56PM +0300, Nadav Har'El wrote:
> This patch adds a new vcpu->requests bit, KVM_REQ_IMMEDIATE_EXIT.
> This bit requests that when next entering the guest, we should run it only
> for as little as possible, and exit again.
>
> We use this new option in nested VMX: When L1 launches L2, but L0 wishes L1
> to continue running so it can inject an event to it, we unfortunately cannot
> just pretend to have run L2 for a little while - We must really launch L2,
> otherwise certain one-off vmcs12 parameters (namely, L1 injection into L2)
> will be lost. So the existing code runs L2 in this case.
> But L2 could potentially run for a long time until it exits, and the
> injection into L1 will be delayed. The new KVM_REQ_IMMEDIATE_EXIT allows us
> to request that L2 will be entered, as necessary, but will exit as soon as
> possible after entry.
>
> Our implementation of this request uses smp_send_reschedule() to send a
> self-IPI, with interrupts disabled. The interrupts remain disabled until the
> guest is entered, and then, after the entry is complete (often including
> processing an injection and jumping to the relevant handler), the physical
> interrupt is noticed and causes an exit.
>
> On recent Intel processors, we could have achieved the same goal by using
> MTF instead of a self-IPI. Another technique worth considering in the future
> is to use VM_EXIT_ACK_INTR_ON_EXIT and a highest-priority vector IPI - to
> slightly improve performance by avoiding the useless interrupt handler
> which ends up being called when smp_send_reschedule() is used.
>
> Signed-off-by: Nadav Har'El <nyh@xxxxxxxxxx>
> ---
>  arch/x86/kvm/vmx.c       |   11 +++++++----
>  arch/x86/kvm/x86.c       |    6 ++++++
>  include/linux/kvm_host.h |    1 +
>  3 files changed, 14 insertions(+), 4 deletions(-)
>
> --- .before/include/linux/kvm_host.h 2011-09-22 13:51:31.000000000 +0300
> +++ .after/include/linux/kvm_host.h 2011-09-22 13:51:31.000000000 +0300
> @@ -48,6 +48,7 @@
>  #define KVM_REQ_EVENT             11
>  #define KVM_REQ_APF_HALT          12
>  #define KVM_REQ_STEAL_UPDATE      13
> +#define KVM_REQ_IMMEDIATE_EXIT    14
>
>  #define KVM_USERSPACE_IRQ_SOURCE_ID 0
>
> --- .before/arch/x86/kvm/x86.c 2011-09-22 13:51:31.000000000 +0300
> +++ .after/arch/x86/kvm/x86.c 2011-09-22 13:51:31.000000000 +0300
> @@ -5610,6 +5610,7 @@ static int vcpu_enter_guest(struct kvm_v
>      bool nmi_pending;
>      bool req_int_win = !irqchip_in_kernel(vcpu->kvm) &&
>          vcpu->run->request_interrupt_window;
> +    bool req_immediate_exit = 0;
>
>      if (vcpu->requests) {
>          if (kvm_check_request(KVM_REQ_MMU_RELOAD, vcpu))
> @@ -5647,6 +5648,8 @@ static int vcpu_enter_guest(struct kvm_v
>          }
>          if (kvm_check_request(KVM_REQ_STEAL_UPDATE, vcpu))
>              record_steal_time(vcpu);
> +        req_immediate_exit =
> +            kvm_check_request(KVM_REQ_IMMEDIATE_EXIT, vcpu);

The immediate-exit information can be lost if the entry path decides to
bail out: kvm_check_request() clears the bit, so a bail-out after this
point drops the request. You can do

    req_immediate_exit = kvm_check_request(KVM_REQ_IMMEDIATE_EXIT, vcpu)

after preempt_disable(), and then transfer the bit back in the bail-out
case in

    if (vcpu->mode == EXITING_GUEST_MODE || vcpu->requests ...
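
To make the suggestion concrete, here is a rough userspace model of the
"consume the bit late, put it back on bail-out" idea. It is only a sketch:
struct toy_vcpu and the toy_* helpers are hypothetical stand-ins, not the
kernel's kvm_vcpu, kvm_check_request() or kvm_make_request(), and the
preempt_disable() and self-IPI steps are reduced to comments.

```c
#include <assert.h>
#include <stdbool.h>

#define KVM_REQ_IMMEDIATE_EXIT 14   /* bit number taken from the patch */

/* Hypothetical, heavily simplified stand-in for struct kvm_vcpu. */
struct toy_vcpu {
	unsigned long requests;      /* pending request bits */
	bool exiting_guest_mode;     /* models vcpu->mode == EXITING_GUEST_MODE */
};

static void toy_make_request(int req, struct toy_vcpu *v)
{
	v->requests |= 1UL << req;
}

/* Test-and-clear, mirroring the consume-on-check behaviour discussed above. */
static bool toy_check_request(int req, struct toy_vcpu *v)
{
	if (!(v->requests & (1UL << req)))
		return false;
	v->requests &= ~(1UL << req);
	return true;
}

/*
 * Returns 1 if the (imaginary) guest was entered, 0 if entry bailed out.
 * *immediate_exit reports whether the entry was asked to exit right away.
 */
static int toy_enter_guest(struct toy_vcpu *v, bool *immediate_exit)
{
	bool req_immediate_exit;

	/* The real code would call preempt_disable() before this point. */
	req_immediate_exit = toy_check_request(KVM_REQ_IMMEDIATE_EXIT, v);

	if (v->exiting_guest_mode || v->requests) {
		/* Bail-out: hand the bit back so the next entry still sees it. */
		if (req_immediate_exit)
			toy_make_request(KVM_REQ_IMMEDIATE_EXIT, v);
		return 0;
	}

	/* The real code would send the self-IPI here if the flag is set. */
	*immediate_exit = req_immediate_exit;
	return 1;
}

/* Small self-check: a bailed-out entry must not lose the request. */
static void toy_selfcheck(void)
{
	struct toy_vcpu v = { 0, true };
	bool imm = false;

	toy_make_request(KVM_REQ_IMMEDIATE_EXIT, &v);
	assert(toy_enter_guest(&v, &imm) == 0);
	assert(v.requests == (1UL << KVM_REQ_IMMEDIATE_EXIT));
}
```

In this model the bit is consumed before the bail-out test, so it does not
itself trigger the vcpu->requests check, and it is only re-raised when the
entry is actually abandoned.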