On 09/24/2012 08:24 AM, Takuya Yoshikawa wrote:
> This is an RFC since I have not done any comparison with the approach
> using for_each_set_bit() which can be seen in Avi's work.
>
> 	Takuya
> ---
>
> We did a simple test to see which requests we would get at the same time
> in vcpu_enter_guest() and got the following numbers:
>
>    |...........|...............|........|...............|.|
>    (N)          (E)             (S)      (ES)            others
>    22.3%        30.7%           16.0%    29.5%           1.4%
>
>    (N) : Nothing
>    (E) : Only KVM_REQ_EVENT
>    (S) : Only KVM_REQ_STEAL_UPDATE
>    (ES): Only KVM_REQ_EVENT and KVM_REQ_STEAL_UPDATE
>
>    * Note that the exact numbers can change for other guests.

What was the workload?  STEAL_UPDATE is done on schedules and on
heavyweight exit (erroneously), so it should be rare.  Or maybe we're
recording HLT time as steal time?

>
> This motivated us to optimize the following code in vcpu_enter_guest():
>
>    if (vcpu->requests) { /** (1) **/
>        ...
>        if (kvm_check_request(KVM_REQ_STEAL_UPDATE, vcpu)) /** (2) **/
>            record_steal_time(vcpu);
>        ...
>    }
>
>    if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
>        ...
>    }
>
> - For case (E), we do kvm_check_request() for every request other than
>   KVM_REQ_EVENT in the block (1) and always get false.
> - For case (S) and (ES), the only difference from case (E) is that we
>   get true for KVM_REQ_STEAL_UPDATE.
>
> This means that we were wasting a lot of time on the many if branches
> in the block (1) even for these trivial three cases, which dominated more
> than 75% in total.
>
> This patch mitigates the issue as follows:
> - For case (E), we change the first if condition to
>      if (vcpu->requests & ~(1 << KVM_REQ_EVENT)) /** (1') **/
>   so that we can skip the block completely.
> - For case (S) and (ES), we move the check (2) upwards, out of the
>   block (1), to clear the KVM_REQ_STEAL_UPDATE flag before doing the
>   check (1').
>
> Although this adds one if branch for case (N), the fact that steal
> update occurs frequently enough, except when we give each vcpu a
> dedicated core, justifies its tiny cost.

Modern processors will predict away the KVM_REQ_EVENT check in many
cases, so the optimization is wasted on them.  Do you have numbers?
Just for your patch, not my alternative.

>
> Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@xxxxxxxxxxxxx>
> ---
>   [My email address change is not a mistake.]
>
>  arch/x86/kvm/x86.c |   11 ++++++++---
>  1 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index fc2a0a1..e351902 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5233,7 +5233,14 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  		vcpu->run->request_interrupt_window;
>  	bool req_immediate_exit = 0;
>
> -	if (vcpu->requests) {
> +	/*
> +	 * Getting KVM_REQ_STEAL_UPDATE alone is so common that clearing it now
> +	 * will hopefully result in skipping the following checks.
> +	 */
> +	if (kvm_check_request(KVM_REQ_STEAL_UPDATE, vcpu))
> +		record_steal_time(vcpu);
> +
> +	if (vcpu->requests & ~(1 << KVM_REQ_EVENT)) {
>  		if (kvm_check_request(KVM_REQ_MMU_RELOAD, vcpu))
>  			kvm_mmu_unload(vcpu);
>  		if (kvm_check_request(KVM_REQ_MIGRATE_TIMER, vcpu))
> @@ -5267,8 +5274,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  			r = 1;
>  			goto out;
>  		}
> -		if (kvm_check_request(KVM_REQ_STEAL_UPDATE, vcpu))
> -			record_steal_time(vcpu);
>  		if (kvm_check_request(KVM_REQ_NMI, vcpu))
>  			process_nmi(vcpu);
>  		req_immediate_exit =

-- 
error compiling committee.c: too many arguments to function