Re: [PATCH 0/2] x86/idle: add halt poll support

Yang Zhang <yang.zhang.wz@xxxxxxxxx> · Fri, 23 Jun 2017 14:49:04 +0800

On 2017/6/23 12:35, Wanpeng Li wrote:
2017-06-23 12:08 GMT+08:00 Yang Zhang <yang.zhang.wz@xxxxxxxxx>:
On 2017/6/22 19:50, Wanpeng Li wrote:

2017-06-22 19:22 GMT+08:00 root <yang.zhang.wz@xxxxxxxxx>:

From: Yang Zhang <yang.zhang.wz@xxxxxxxxx>

Some latency-sensitive workloads see an obvious performance drop
when running inside a VM. The main reason is that overhead is
amplified when running inside a VM. The largest cost I have seen is
in the idle path.
This patch introduces a new mechanism to poll for a while before
entering the idle state. If a reschedule is needed during the poll,
we don't need to go through the heavy overhead path.
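
The poll-before-idle idea described above can be illustrated with a
small user-space sketch. This is not the code from the patch: the flag
need_resched_flag, the helper now_ns() and the function
poll_before_halt() are hypothetical names chosen for illustration, and
the wall clock from timespec_get() stands in for whatever time source
the kernel would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the kernel's "reschedule needed" state. */
static atomic_bool need_resched_flag;

/* Current time in nanoseconds (C11 timespec_get, wall clock). */
static uint64_t now_ns(void)
{
    struct timespec ts;
    int ok = timespec_get(&ts, TIME_UTC);

    assert(ok == TIME_UTC);
    (void)ok;
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Spin for at most poll_ns nanoseconds waiting for work.
 * Returns true if work showed up while polling, so the expensive
 * halt path can be skipped; returns false if the budget expired and
 * the caller should fall back to the normal idle (halt) path.
 */
static bool poll_before_halt(uint64_t poll_ns)
{
    uint64_t deadline = now_ns() + poll_ns;

    do {
        if (atomic_load_explicit(&need_resched_flag,
                                 memory_order_acquire))
            return true;
    } while (now_ns() < deadline);

    return false;
}
```

A caller would halt only when poll_before_halt() returns false. The
poll budget trades CPU cycles spent spinning against the latency of
the halt and wakeup path.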

Here is the data I get when running the contextswitch benchmark
(https://github.com/tsuna/contextswitch)
before patch:
2000000 process context switches in 4822613801ns (2411.3ns/ctxsw)
after patch:
2000000 process context switches in 3584098241ns (1792.0ns/ctxsw)
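
As a cross-check of the figures above, the per-switch cost and the
relative reduction can be recomputed from the raw totals. The helper
names below are hypothetical and only restate the arithmetic; with the
numbers quoted, the reduction works out to roughly 25.7%.

```c
#include <assert.h>
#include <stdint.h>

/* Average cost of one context switch in nanoseconds. */
static double ns_per_switch(uint64_t total_ns, uint64_t switches)
{
    assert(switches > 0);
    return (double)total_ns / (double)switches;
}

/* Relative reduction of `after` versus `before`, in percent. */
static double reduction_percent(double before, double after)
{
    assert(before > 0.0);
    return (before - after) / before * 100.0;
}
```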

Did you test this after disabling the adaptive halt-polling in kvm?
What is the performance data w/ this patchset and w/o the adaptive
halt-polling in kvm, and w/o this patchset and w/ the adaptive
halt-polling in kvm? In addition, both Linux and Windows guests can
benefit, as we have already done this in kvm.

I will provide more data in the next version. But it doesn't conflict
with the current halt polling inside kvm. This is just another
enhancement.

Another case I can think of is w/ both this patchset and the adaptive
halt-polling in kvm.

I didn't look closely at the patchset; however, there may sometimes
be another poll in the kvm part if the poll in the guest fails. In
addition, in my testing the adaptive halt-polling in kvm has a
performance penalty when the pCPU is heavily overcommitted, even
though there is a single_task_running() check. From inside the guest
it is hard to know accurately whether other tasks are waiting on the
pCPU, which makes it worse. Depending on vcpu_is_preempted() or steal
time may not be accurate or direct.

So I'm not sure how much sense it makes to have adaptive halt-polling
in both the guest and kvm. I prefer to keep adaptive halt-polling only
in kvm (then Linux, Windows, and other guests can all benefit) and
avoid churning the core x86 path.

This mechanism is not specific to KVM. It is a kernel feature that can
benefit a guest running in any x86 virtualization environment,
including KVM, Xen, VMware, and Hyper-V. An administrator can make KVM
use adaptive halt polling, but cannot control whether the user uses
halt polling inside the guest. Many users set idle=poll inside the
guest to improve performance, which occupies more CPU cycles. This
mechanism is an enhancement to that, not to KVM halt polling.

Regards,
Wanpeng Li

--
Yang
Alibaba Cloud Computing