Re: [PATCH] KVM: add kvm_arch_cpu_kick

Christian Borntraeger <borntraeger@xxxxxxxxxx> · Tue, 21 Feb 2017 09:59:25 +0100

On 02/20/2017 10:45 PM, Radim Krčmář wrote:
> 2017-02-20 12:35+0100, David Hildenbrand:
>> Am 20.02.2017 um 12:12 schrieb Christian Borntraeger:
>>> On 02/17/2017 06:10 PM, David Hildenbrand wrote:
>>>>
>>>>>> Yes, it would.  There's some parallel with QEMU's qemu_cpu_kick, where
>>>>>> the signal would be processed immediately after entering KVM_RUN.
>>>>>
>>>>> Something like 
>>>>>
>>>>> ---snip-----
>>>>>         struct kvm_s390_sie_block *scb = READ_ONCE(vcpu->arch.vsie_block);
>>>>>
>>>>> 	atomic_or(CPUSTAT_STOP_INT, &vcpu->arch.sie_block->cpuflags);
>>>>>         if (scb)
>>>>> 		atomic_or(CPUSTAT_STOP_INT, &scb->cpuflags);
>>>>> ---snip-----
>>>>>
>>>>> or 
>>>>> ---snip-----
>>>>> 	atomic_or(CPUSTAT_STOP_INT, &vcpu->arch.sie_block->cpuflags);
>>>>> 	kvm_s390_vsie_kick(vcpu);
>>>>> ---snip-----
>>>>
>>>> I'd go for the latter one. Keep the vsie stuff isolated. Please note
>>>
>>> Yes makes sense.
>>>
>>> Radim, if you go with this patch something like this can be used as the
>>> s390 variant of kvm_arch_cpu_kick:
>>>
>>> ---snip---
>>> 	/*
>>> 	 * The stop indication is reset in the interrupt code. As the CPU
>>> 	 * loop handles requests after interrupts, we will
>>> 	 * a: miss the request handler and enter the guest, but then the
>>> 	 * stop request will exit the CPU and handle the request in the next
>>> 	 * round or
>>> 	 * b: handle the request directly before entering the guest
>>> 	 */
>>> 	atomic_or(CPUSTAT_STOP_INT, &vcpu->arch.sie_block->cpuflags);
>>> 	kvm_s390_vsie_kick(vcpu);
>>>
>>> ---snip---
>>> feel free to add that to your patch. I can also send a fixup patch later
>>> on if you prefer that.
>>
>> kvm_arch_vcpu_should_kick() then also has to be changed to return 1 for now.
>>
>> An interesting thing to note is how vcpu->cpu is used.
>>
>> Again, as s390x can preempt just before entering the guest, vcpu_kick()
>> might see vcpu->cpu = -1. Therefore, kvm_arch_vcpu_should_kick() won't
>> even be called. So our cpu might go into guest mode and stay there
>> longer than expected (as we won't kick it).
>>
>> On x86, it is the following way:
>>
>> If vcpu->cpu is -1, no need to kick the VCPU. It will check for requests
>> when preemption is disabled, therefore when rescheduled.
>>
>> If vcpu->cpu is set, kvm_arch_vcpu_should_kick() remembers if the VCPU
>> has already been kicked while in the critical section. It will get
>> kicked by smp resched as soon as entering guest mode.
>>
>> So here, disabled preemption + checks in the section with disabled
>> preemption (for requests and EXITING_GUEST_MODE) make sure that the
>> guest will leave guest mode and process requests in a timely fashion.
>>
>> On s390x, this is not 100% true. vcpu->cpu cannot be used as an
>> indicator whether a kick is necessary. Either that is ok for now, or the
>> vcpu->cpu != -1 check has to be disabled for s390x, e.g. by moving the
>> check into kvm_arch_vcpu_should_kick().
> 
> Good point.
> 
> So s390 doesn't need vcpu->cpu and only sets it because other arches do?

David added it as a sanity check for cpu time accounting while in host. But
we do not need it, yes.


> And do I understand it correctly that the s390 SIE block operations have
> no side-effect, apart from changed memory, when outside of guest-mode?

You mean accesses to the sie control block vcpu->arch.sie_block? Yes its just
changed memory as long as the VCPU that is backed by this sie control block
is not running.


> (We have cpu->mode mostly because interrupts are expensive. :])
> 
> In the end, I'd like to use kvm_vcpu_kick() for kvm_s390_vcpu_wakeup().

something like this?

--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -1067,15 +1067,7 @@ void kvm_s390_vcpu_wakeup(struct kvm_vcpu *vcpu)
         * in kvm_vcpu_block without having the waitqueue set (polling)
         */
        vcpu->valid_wakeup = true;
-       if (swait_active(&vcpu->wq)) {
-               /*
-                * The vcpu gave up the cpu voluntarily, mark it as a good
-                * yield-candidate.
-                */
-               vcpu->preempted = true;
-               swake_up(&vcpu->wq);
-               vcpu->stat.halt_wakeup++;
-       }
+       kvm_vcpu_wake_up(vcpu);
        /*
         * The VCPU might not be sleeping but is executing the VSIE. Let's
         * kick it, so it leaves the SIE to process the request.
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2213,6 +2213,7 @@ void kvm_vcpu_wake_up(struct kvm_vcpu *vcpu)
        wqp = kvm_arch_vcpu_wq(vcpu);
        if (swait_active(wqp)) {
                swake_up(wqp);
+               vcpu->preempted = true;
                ++vcpu->stat.halt_wakeup;
        }
 


> s390 sets vcpu->preempted to get a performance boost, which makes
> touching it less than desirable ...
> On s390, vcpu->preempted is only used in __diag_time_slice_end(), which
> seems to be a type of spinning-on-a-taken-lock hypercall -- any reason
> why that optimization shouldn't work on other architectures?

We set preempted in kvm_s390_vcpu_wakeup because otherwise a cpu that sleeps
(halted) would not be considered as a good candidate in kvm_vcpu_on_spin,
even if we just decided to wakeup that CPU for an interrupt.

Yes, it certainly makes sense to do that in kvm_vcpu_wake_up as well.