On Wed, May 12, 2021, Wanpeng Li wrote:
> On Wed, 12 May 2021 at 05:44, Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> >
> > On Sat, May 08, 2021, Wanpeng Li wrote:
> > > From: Wanpeng Li <wanpengli@xxxxxxxxxxx>
> > >
> > > In undercommitted scenarios, vCPUs can get scheduled easily;
> > > kvm_vcpu_yield_to adds extra overhead. We can observe a lot of races
> > > between vcpu->ready being true and the yield failing because p->state
> > > is TASK_RUNNING. Let's bail out in such scenarios by checking the
> > > length of the current CPU's runqueue.
> > >
> > > Signed-off-by: Wanpeng Li <wanpengli@xxxxxxxxxxx>
> > > ---
> > >  arch/x86/kvm/x86.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index 5bd550e..c0244a6 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -8358,6 +8358,9 @@ static void kvm_sched_yield(struct kvm_vcpu *vcpu, unsigned long dest_id)
> > >  	struct kvm_vcpu *target = NULL;
> > >  	struct kvm_apic_map *map;
> > >
> > > +	if (single_task_running())
> > > +		goto no_yield;
> > > +
> >
> > Hmm, could we push the result of kvm_sched_yield() down into the guest?
> > Currently the guest bails after the first attempt, which is perfect for this
> > scenario, but it seems like it would make sense to keep trying to yield if there
> > are multiple preempted vCPUs and
>
> It can race in the case of sustained yielding if there are multiple
> preempted vCPUs: the vCPU which you intend to yield to may have already
> finished handling the IPI and be preempted again by the time the sender
> that yielded is scheduled back in and checks the next preempted candidate.

Ah, right, don't want to penalize the happy case.

> > Unrelated to this patch, but it's the first time I've really looked at the guest
> > side of directed yield...
> >
> > Wouldn't it also make sense for the guest side to hook .send_call_func_single_ipi?
>
> The reschedule IPI is sent via the .smp_send_reschedule hook; there is a lot
> of research aimed at accelerating idle vCPU reactivation. My original
> intention was to boost synchronization primitives. I believe we need a lot
> of benchmarking to weigh inter-VM fairness against the performance benefit
> of hooking .send_call_func_single_ipi and .smp_send_reschedule.

I was thinking of the 2 vCPU case.  If the VM has 2 vCPUs, then this

	/*
	 * Choose the most efficient way to send an IPI. Note that the
	 * number of CPUs might be zero due to concurrent changes to the
	 * provided mask.
	 */
	if (nr_cpus == 1)
		send_call_function_single_ipi(last_cpu);
	else if (likely(nr_cpus > 1))
		arch_send_call_function_ipi_mask(cfd->cpumask_ipi);

means .send_call_func_single_ipi() will always be used to send an IPI to the
other vCPU, and thus 2 vCPU VMs will never utilize PV yield.
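
For reference, the guest-side PV yield hook being discussed looks roughly like
this (a sketch of arch/x86/kernel/kvm.c from memory; exact code may differ
across kernel versions). It is only installed behind .send_call_func_ipi and
bails after the first preempted vCPU it finds:

	/* Sketch of the guest hook; details may vary by kernel version. */
	static void kvm_smp_send_call_func_ipi(const struct cpumask *mask)
	{
		int cpu;

		/* Send the real IPIs first. */
		native_send_call_func_ipi(mask);

		/* Yield to the first preempted target, then give up. */
		for_each_cpu(cpu, mask) {
			if (vcpu_is_preempted(cpu)) {
				kvm_hypercall1(KVM_HC_SCHED_YIELD,
					       per_cpu(x86_cpu_to_apicid, cpu));
				break;
			}
		}
	}

	/* In kvm_guest_init(), under CONFIG_SMP: */
	if (pv_sched_yield_supported()) {
		smp_ops.send_call_func_ipi = kvm_smp_send_call_func_ipi;
		pr_info("setup PV sched yield\n");
	}

That break after the first preempted vCPU is the "guest bails after the first
attempt" behavior mentioned above, and since the nr_cpus == 1 path never goes
through this hook, a 2 vCPU guest never issues KVM_HC_SCHED_YIELD.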