Re: [PATCH] KVM: nVMX: Fix direct injection of interrupts from L0 to L2

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2013-02-17 16:07, Gleb Natapov wrote:
> On Sat, Feb 16, 2013 at 06:10:14PM +0100, Jan Kiszka wrote:
>> From: Jan Kiszka <jan.kiszka@xxxxxxxxxxx>
>>
>> If L1 does not set PIN_BASED_EXT_INTR_MASK, we incorrectly skipped
>> vmx_complete_interrupts on L2 exits. This is required because, with
>> direct interrupt injection from L0 to L2, L0 has to update its pending
>> events.
>>
>> Also, we need to allow vmx_cancel_injection when entering L2 in we left
>> to L0. This condition is indirectly derived from the absence of valid
>> vectoring info in vmcs12. We no explicitly clear it if we find out that
>> the L2 exit is not targeting L1 but L0.
>>
> We really need to overhaul how interrupt injection is emulated in nested
> VMX. Why not put pending events into event queue instead of
> get_vmcs12(vcpu)->idt_vectoring_info_field and inject them in usual way.

I was thinking about the same step but felt unsure so far if
vmx_complete_interrupts & Co. do not include any assumptions about the
vmcs configuration that won't match what L1 does. So I went for a
different path first, specifically to avoid impact on these hairy bits
for non-nested mode.

> 
>> Signed-off-by: Jan Kiszka <jan.kiszka@xxxxxxxxxxx>
>> ---
>>  arch/x86/kvm/vmx.c |   43 +++++++++++++++++++++++++++----------------
>>  1 files changed, 27 insertions(+), 16 deletions(-)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 68a045ae..464b6a5 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -624,6 +624,7 @@ static void vmx_get_segment(struct kvm_vcpu *vcpu,
>>  			    struct kvm_segment *var, int seg);
>>  static bool guest_state_valid(struct kvm_vcpu *vcpu);
>>  static u32 vmx_segment_access_rights(struct kvm_segment *var);
>> +static void vmx_complete_interrupts(struct vcpu_vmx *vmx);
>>  
>>  static DEFINE_PER_CPU(struct vmcs *, vmxarea);
>>  static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
>> @@ -6213,9 +6214,19 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
>>  	else
>>  		vmx->nested.nested_run_pending = 0;
>>  
>> -	if (is_guest_mode(vcpu) && nested_vmx_exit_handled(vcpu)) {
>> -		nested_vmx_vmexit(vcpu);
>> -		return 1;
>> +	if (is_guest_mode(vcpu)) {
>> +		if (nested_vmx_exit_handled(vcpu)) {
>> +			nested_vmx_vmexit(vcpu);
>> +			return 1;
>> +		}
>> +		/*
>> +		 * Now it's clear, we are leaving to L0. Perform the postponed
>> +		 * interrupt completion and clear L1's vectoring info field so
>> +		 * that we do not overwrite what L0 wants to inject on
>> +		 * re-entry.
>> +		 */
>> +		vmx_complete_interrupts(vmx);
>> +		get_vmcs12(vcpu)->idt_vectoring_info_field = 0;
>>  	}
>>  
>>  	if (exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY) {
>> @@ -6495,8 +6506,6 @@ static void __vmx_complete_interrupts(struct vcpu_vmx *vmx,
>>  
>>  static void vmx_complete_interrupts(struct vcpu_vmx *vmx)
>>  {
>> -	if (is_guest_mode(&vmx->vcpu))
>> -		return;
>>  	__vmx_complete_interrupts(vmx, vmx->idt_vectoring_info,
>>  				  VM_EXIT_INSTRUCTION_LEN,
>>  				  IDT_VECTORING_ERROR_CODE);
>> @@ -6504,7 +6513,9 @@ static void vmx_complete_interrupts(struct vcpu_vmx *vmx)
>>  
>>  static void vmx_cancel_injection(struct kvm_vcpu *vcpu)
>>  {
>> -	if (is_guest_mode(vcpu))
>> +	if (is_guest_mode(vcpu) &&
>> +	    get_vmcs12(vcpu)->idt_vectoring_info_field &
>> +			VECTORING_INFO_VALID_MASK)
> Why skip cancel_injection at all? As far as I see we can lose injected
> irq if we do. Consider:
> 
>   io thread                                  vcpu in nested mode
> set irr 200
>                                           clear irr 200 set isr 200
>                                           set 200 in VM_ENTRY_INTR_INFO_FIELD
> set irr 250
> set KVM_REQ_EVENT
>                                           if (KVM_REQ_EVENT)
>                                                   vmx_cancel_injection() <- does nothing

No, it does cancel as vmcs12's idt_vectoring_info_field signals an
invalid state then. Only if we left L2 with valid vectoring info and are
about to reenter, we skip the cancellation - but in that case we didn't
inject anything from L0 previously anyway.

Jan

> 
>                                           clear irr 250 set isr 250
>                                           set 250 in VM_ENTRY_INTR_INFO_FIELD
>                                           vmentry
> 
> So now APIC state is bogus. isr bit 200 is set but vector 200 was never
> injected and actually is lost forever. Next EOI will clear isr 250 and
> isr 200 will block all lower level interrupt forever.
> 
> --
> 			Gleb.
> 


Attachment: signature.asc
Description: OpenPGP digital signature


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux