Re: [PATCH v4] KVM: nVMX: Fully support nested VMX preemption timer

On 16/09/2013 09:44, Gleb Natapov wrote:
> On Fri, Sep 13, 2013 at 07:15:11PM +0200, Paolo Bonzini wrote:
>> On 06/09/2013 04:04, Arthur Chunqi Li wrote:
>>> +static void nested_adjust_preemption_timer(struct kvm_vcpu *vcpu)
>>> +{
>>> +	u64 delta_tsc_l1;
>>> +	u32 preempt_val_l1, preempt_val_l2, preempt_scale;
>>
>> Should this exit immediately if the preemption timer pin-based control
>> is disabled?
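
To be concrete, I was thinking of an early return at the top of the
function, along these lines (untested sketch):

	if (!(get_vmcs12(vcpu)->pin_based_vm_exec_control &
	      PIN_BASED_VMX_PREEMPTION_TIMER))
		return;
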
>>
>>> +	preempt_scale = native_read_msr(MSR_IA32_VMX_MISC) &
>>> +			MSR_IA32_VMX_MISC_PREEMPTION_TIMER_SCALE;
>>> +	preempt_val_l2 = vmcs_read32(VMX_PREEMPTION_TIMER_VALUE);
>>> +	delta_tsc_l1 = kvm_x86_ops->read_l1_tsc(vcpu,
>>> +			native_read_tsc()) - vcpu->arch.last_guest_tsc;
>>
>> Please format this like:
>>
>> 	delta_tsc_l1 =
>> 		kvm_x86_ops->read_l1_tsc(vcpu, native_read_tsc())
>> 		- vcpu->arch.last_guest_tsc;
>>
> And call vmx_read_l1_tsc() directly. Actually you can even use
> to_vmx(vcpu)->nested.vmcs01_tsc_offset directly here, since the function
> will only be called when is_guest_mode() == true, but vmx_read_l1_tsc()
> may be clearer, and the compiler should optimize the second
> is_guest_mode() check away anyway.

Right.  Though I'm not that concerned about optimizing L1->L2 and L0->L2
entries; I'm more concerned about not slowing down L0->L1 (which
includes the non-nested case, of course).
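
Gleb's suggestion then amounts to (same computation, just the direct
call plus the formatting above):

	delta_tsc_l1 =
		vmx_read_l1_tsc(vcpu, native_read_tsc())
		- vcpu->arch.last_guest_tsc;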

>>> +	preempt_val_l1 = delta_tsc_l1 >> preempt_scale;
>>> +	if (preempt_val_l2 <= preempt_val_l1)
>>> +		preempt_val_l2 = 0;
>>> +	else
>>> +		preempt_val_l2 -= preempt_val_l1;
>>> +	vmcs_write32(VMX_PREEMPTION_TIMER_VALUE, preempt_val_l2);
>>
>> Did you test that a value of 0 triggers an immediate exit, rather than
>> wrapping around and counting down from 2^32?  Perhaps it's safer to
>> limit the value to 1 instead of 0.
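
In other words, if 0 turns out not to force an immediate exit,
something like this (untested):

	if (preempt_val_l2 <= preempt_val_l1)
		preempt_val_l2 = 1;	/* expire as soon as possible */
	else
		preempt_val_l2 -= preempt_val_l1;
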
>>
>>> +}
>>> +
>>>  /*
>>>   * The guest has exited.  See if we can fix it or if we need userspace
>>>   * assistance.
>>> @@ -6736,9 +6766,11 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
>>>  		vmx->nested.nested_run_pending = 0;
>>>  
>>>  	if (is_guest_mode(vcpu) && nested_vmx_exit_handled(vcpu)) {
>>> +		vmx->nested.nested_vmx_exit = true;
>>
>> I think this assignment should be in nested_vmx_vmexit, since it is
>> called from other places as well.
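
That is, make

	vmx->nested.nested_vmx_exit = true;

the first statement of nested_vmx_vmexit(), so that all callers set it,
not just vmx_handle_exit.
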
>>
>>>  		nested_vmx_vmexit(vcpu);
>>>  		return 1;
>>>  	}
>>> +	vmx->nested.nested_vmx_exit = false;
>>>  
>>>  	if (exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY) {
>>>  		vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;
>>> @@ -7132,6 +7164,8 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
>>>  	debugctlmsr = get_debugctlmsr();
>>>  
>>>  	vmx->__launched = vmx->loaded_vmcs->launched;
>>> +	if (is_guest_mode(vcpu) && !(vmx->nested.nested_vmx_exit))
>>> +		nested_adjust_preemption_timer(vcpu);
>>
>> Please leave the assignment to __launched last, since it's already
>> initializing the asm below.
>>
>> I don't like the is_guest_mode check here... Maybe it's
>> micro-optimizing, but I wonder if we already do too many checks in
>> vmx_vcpu_run...  For example, is_guest_mode could be changed (I think)
>> to a check for "vmx->loaded_vmcs != &vmx->vmcs01".
>>
> Why would this be more efficient than the HF_GUEST_MASK check?

Because we have already loaded vmx->loaded_vmcs for the __launched
assignment, so it's one memory access fewer.
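
Concretely, keeping the assignment to __launched last as I asked above
(untested):

	if (vmx->loaded_vmcs != &vmx->vmcs01 &&
	    !vmx->nested.nested_vmx_exit)
		nested_adjust_preemption_timer(vcpu);
	vmx->__launched = vmx->loaded_vmcs->launched;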

>> Alternatively, we could change nested_vmx_exit to an enum in struct
>> vcpu_vmx (with values for L0->L1, L0->L2, L1->L2) that is initialized in
>> vmx_handle_exit.  Then we could check directly for L0->L2 and not adjust
>> the preemption timer in other cases.  In fact, I suspect this enum could
>> replace HF_GUEST_MASK altogether.  However, this would require some
>> other, more complicated changes to svm.c.
>>
>> Gleb, what do you think?
>>
> I do not see why nested_vmx_exit is necessary at all yet. We can detect
> all the aforementioned cases without it.

How do you distinguish L0->L2 from L1->L2 in vmx_vcpu_run?
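
With an enum you get the distinction for free; hypothetical names:

	enum nested_entry_kind {
		ENTRY_L0_L1,	/* includes the non-nested case */
		ENTRY_L0_L2,	/* resuming L2: adjust the preemption timer */
		ENTRY_L1_L2,	/* fresh vmentry from L1 */
	};

vmx_handle_exit would set it, and vmx_vcpu_run would only need:

	/* entry_kind is a hypothetical field set in vmx_handle_exit */
	if (vmx->nested.entry_kind == ENTRY_L0_L2)
		nested_adjust_preemption_timer(vcpu);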

Paolo
