Re: [PATCH] [RESEND] KVM:VMX: Add support for Pause-Loop Exiting

Marcelo Tosatti <mtosatti@xxxxxxxxxx> · Fri, 2 Oct 2009 15:28:40 -0300

On Wed, Sep 30, 2009 at 01:22:49PM -0300, Marcelo Tosatti wrote:
> On Wed, Sep 30, 2009 at 09:01:51AM +0800, Zhai, Edwin wrote:
> > Avi,
> > I modified it according to your comments. The only thing I want to keep is
> > the module params ple_gap/ple_window. Although they are not per-guest, they
> > can be used to find the right values and to disable PLE for debugging.
> >
> > Thanks,
> >
> >
> > Avi Kivity wrote:
> >> On 09/28/2009 11:33 AM, Zhai, Edwin wrote:
> >>   
> >>> Avi Kivity wrote:
> >>>     
> >>>> +#define KVM_VMX_DEFAULT_PLE_GAP    41
> >>>> +#define KVM_VMX_DEFAULT_PLE_WINDOW 4096
> >>>> +static int __read_mostly ple_gap = KVM_VMX_DEFAULT_PLE_GAP;
> >>>> +module_param(ple_gap, int, S_IRUGO);
> >>>> +
> >>>> +static int __read_mostly ple_window = KVM_VMX_DEFAULT_PLE_WINDOW;
> >>>> +module_param(ple_window, int, S_IRUGO);
> >>>>
> >>>> Shouldn't be __read_mostly since they're read very rarely  
> >>>> (__read_mostly should be for variables that are very often read, 
> >>>> and rarely written).
> >>>>       
> >>> In general they are read-only, except that an experienced user may try
> >>> different parameters for perf tuning.
> >>>     
> >>
> >>
> >> __read_mostly doesn't just mean it's read mostly.  It also means it's  
> >> read often.  Otherwise it's just wasting space in hot cachelines.
> >>
> >>   
> >>>> I'm not even sure they should be parameters.
> >>>>       
> >>> Different spinlock implementations in different OSes, and different
> >>> workloads, need different parameters for tuning. It's similar to
> >>> enable_ept.
> >>>     
> >>
> >> No, global parameters don't work for tuning workloads and guests since  
> >> they cannot be modified on a per-guest basis.  enable_ept is only 
> >> useful for debugging and testing.
> >>
> >>   
> >>>>> +    set_current_state(TASK_INTERRUPTIBLE);
> >>>>> +    schedule_hrtimeout(&expires, HRTIMER_MODE_ABS);
> >>>>> +
> >>>>>         
> >>>> Please add a tracepoint for this (since it can cause significant  
> >>>> change in behaviour),       
> >>> Isn't trace_kvm_exit(exit_reason, ...) enough? We can tell the PLE  
> >>> vmexit from other vmexits.
> >>>     
> >>
> >> Right.  I thought of the software spinlock detector, but that's another 
> >> problem.
> >>
> >> I think you can drop the sleep_time parameter, it can be part of the  
> >> function.  Also kvm_vcpu_sleep() is confusing, we also sleep on halt.   
> >> Please call it kvm_vcpu_on_spin() or something (since that's what the  
> >> guest is doing).
> 
> kvm_vcpu_on_spin() should add the vcpu to vcpu->wq (so a new pending
> interrupt wakes it up immediately).

Updated version (also please send it separately from the vmx.c patch):

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 894a56e..43125dc 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -231,6 +231,7 @@ int kvm_is_visible_gfn(struct kvm *kvm, gfn_t gfn);
 void mark_page_dirty(struct kvm *kvm, gfn_t gfn);
 
 void kvm_vcpu_block(struct kvm_vcpu *vcpu);
+void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu);
 void kvm_resched(struct kvm_vcpu *vcpu);
 void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
 void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4d0dd39..e788d70 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1479,6 +1479,21 @@ void kvm_resched(struct kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvm_resched);
 
+void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu)
+{
+	ktime_t expires;
+	DEFINE_WAIT(wait);
+
+	prepare_to_wait(&vcpu->wq, &wait, TASK_INTERRUPTIBLE);
+
+	/* Sleep for 100 us, and hope lock-holder got scheduled */
+	expires = ktime_add_ns(ktime_get(), 100000UL);
+	schedule_hrtimeout(&expires, HRTIMER_MODE_ABS);
+
+	finish_wait(&vcpu->wq, &wait);
+}
+EXPORT_SYMBOL_GPL(kvm_vcpu_on_spin);
+
 static int kvm_vcpu_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 {
 	struct kvm_vcpu *vcpu = vma->vm_file->private_data;