Re: [PATCH v2] KVM: LAPIC: Fix cancel preemption timer repeatedly due to preemption

Wanpeng Li <kernellwp@xxxxxxxxx> · Mon, 24 Jul 2017 23:08:44 +0800

2017-07-24 22:45 GMT+08:00 Paolo Bonzini <pbonzini@xxxxxxxxxx>:
> On 24/07/2017 10:57, Wanpeng Li wrote:
>> From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>>
>> Preemption can occur in the preemption timer expiration handler:
>>
>>           CPU0                    CPU1
>>
>>   preemption timer vmexit
>>   handle_preemption_timer(vCPU0)
>>     kvm_lapic_expired_hv_timer
>>       hv_timer_is_use == true
>>   sched_out
>>                            sched_in
>>                            kvm_arch_vcpu_load
>>                              kvm_lapic_restart_hv_timer
>>                                restart_apic_timer
>>                                  start_hv_timer
>>                                    already-expired timer or sw timer triggerd in the window
>>                                  start_sw_timer
>>                                    cancel_hv_timer
>
> At this point, the timer interrupt is injected, right?

Do you mean the new one on CPU1? I think we just set the pending
timer, we return back to kvm_lapic_expired_hv_timer() after preempt
notifier sched_in.

>
> If this is correct, kvm_lapic_expired_hv_timer can just do nothing if
> the timer is not in use, with a comment explaining that the preemption
> notifier has run start_sw_timer and thus injected the timer interrupt.
>
>>                            /* back in kvm_lapic_expired_hv_timer */
>>                            cancel_hv_timer
>>                              WARN_ON(!apic->lapic_timer.hv_timer_in_use);  ==> Oops
>>
>> This can be reproduced if CONFIG_PREEMPT is enabled.
>>
>> This patch fixes it by don't cancel preemption timer repeatedly if
>> the preemption timer has already been cancelled due to preemption
>> since already-expired timer or sw timer triggered in the window.
>>
>> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>> Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
>> Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>> ---
>>  arch/x86/kvm/lapic.c | 10 +++++++---
>>  1 file changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index 2819d4c..8341b40 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -1560,9 +1560,13 @@ void kvm_lapic_expired_hv_timer(struct kvm_vcpu *vcpu)
>>  {
>>       struct kvm_lapic *apic = vcpu->arch.apic;
>>
>> -     WARN_ON(!apic->lapic_timer.hv_timer_in_use);
>> -     WARN_ON(swait_active(&vcpu->wq));
>> -     cancel_hv_timer(apic);
>> +     preempt_disable();
>> +     if (!(!apic_lvtt_period(apic) && atomic_read(&apic->lapic_timer.pending))) {
>
> Why is the "if" necessary?
>
> Maybe all of kvm_lapic_expired_hv_timer and start_sw_timer should be in
> preemption-disabled regions, which trivially avoids any reentrancy issue
> with the preempt notifier.  Then, cancel_hv_timer can assert that it's
> called with preemption disabled.

For example:

static int handle_preemption_timer(struct kvm_vcpu *vcpu)
{
     --------------------------------------------------> We still can
be preempted here, and do one cancel_hv_timer()
    preempt_disable();
    kvm_lapic_expired_hv_timer(vcpu);   -----> WARN_ON in
cancel_hv_timer() even if we remove the WARN_ON in
kvm_lapic_expired_hv_timer() as you mentioned above
    preempt_enable();
    return 1;
}

Regards,
Wanpeng Li