On 09/24/2012 08:24 AM, Takuya Yoshikawa wrote:
> This is an RFC since I have not done any comparison with the approach
> using for_each_set_bit() which can be seen in Avi's work.
>
> 	Takuya
> ---
>
> We did a simple test to see which requests we would get at the same time
> in vcpu_enter_guest() and got the following numbers:
>
>    |...........|...............|........|...............|.|
>    (N)          (E)             (S)      (ES)            others
>    22.3%        30.7%           16.0%    29.5%           1.4%
>
>    (N) : Nothing
>    (E) : Only KVM_REQ_EVENT
>    (S) : Only KVM_REQ_STEAL_UPDATE
>    (ES): Only KVM_REQ_EVENT and KVM_REQ_STEAL_UPDATE
>
>    * Note that the exact numbers can change for other guests.

What was the workload?  STEAL_UPDATE is done on schedules and on
heavyweight exit (erroneously), so it should be rare.  Or maybe we're
recording HLT time as steal time?

>
> This motivated us to optimize the following code in vcpu_enter_guest():
>
>    if (vcpu->requests) { /** (1) **/
>        ...
>        if (kvm_check_request(KVM_REQ_STEAL_UPDATE, vcpu)) /** (2) **/
>            record_steal_time(vcpu);
>        ...
>    }
>
>    if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
>        ...
>    }
>
> - For case (E), we do kvm_check_request() for every request other than
>   KVM_REQ_EVENT in the block (1) and always get false.
> - For case (S) and (ES), the only difference from case (E) is that we
>   get true for KVM_REQ_STEAL_UPDATE.
>
> This means that we were wasting a lot of time on the many if branches
> in the block (1) even for these trivial three cases, which dominated more
> than 75% in total.
>
> This patch mitigates the issue as follows:
> - For case (E), we change the first if condition to
>      if (vcpu->requests & ~(1 << KVM_REQ_EVENT)) /** (1') **/
>   so that we can skip the block completely.
> - For case (S) and (ES), we move the check (2) upwards, out of the
>   block (1), to clear the KVM_REQ_STEAL_UPDATE flag before doing the
>   check (1').
>
> Although this adds one if branch for case (N), the fact that steal
> update occurs frequently enough, except when we give each vcpu a
> dedicated core, justifies its tiny cost.

Modern processors will predict away the KVM_REQ_EVENT check in many
cases, so the optimization is wasted on them.  Do you have numbers?
Just for your patch, not my alternative.

>
> Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@xxxxxxxxxxxxx>
> ---
>   [My email address change is not a mistake.]
>
>  arch/x86/kvm/x86.c |   11 ++++++++---
>  1 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index fc2a0a1..e351902 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5233,7 +5233,14 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  		vcpu->run->request_interrupt_window;
>  	bool req_immediate_exit = 0;
>
> -	if (vcpu->requests) {
> +	/*
> +	 * Getting KVM_REQ_STEAL_UPDATE alone is so common that clearing it now
> +	 * will hopefully result in skipping the following checks.
> +	 */
> +	if (kvm_check_request(KVM_REQ_STEAL_UPDATE, vcpu))
> +		record_steal_time(vcpu);
> +
> +	if (vcpu->requests & ~(1 << KVM_REQ_EVENT)) {
>  		if (kvm_check_request(KVM_REQ_MMU_RELOAD, vcpu))
>  			kvm_mmu_unload(vcpu);
>  		if (kvm_check_request(KVM_REQ_MIGRATE_TIMER, vcpu))
> @@ -5267,8 +5274,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  			r = 1;
>  			goto out;
>  		}
> -		if (kvm_check_request(KVM_REQ_STEAL_UPDATE, vcpu))
> -			record_steal_time(vcpu);
>  		if (kvm_check_request(KVM_REQ_NMI, vcpu))
>  			process_nmi(vcpu);
>  		req_immediate_exit =

-- 
error compiling committee.c: too many arguments to function