On Wed, Sep 30, 2009 at 01:22:49PM -0300, Marcelo Tosatti wrote:
> On Wed, Sep 30, 2009 at 09:01:51AM +0800, Zhai, Edwin wrote:
> > Avi,
> > I modify it according your comments. The only thing I want to keep is
> > the module param ple_gap/window. Although they are not per-guest, they
> > can be used to find the right value, and disable PLE for debug purpose.
> >
> > Thanks,
> >
> >
> > Avi Kivity wrote:
> >> On 09/28/2009 11:33 AM, Zhai, Edwin wrote:
> >>
> >>> Avi Kivity wrote:
> >>>
> >>>> +#define KVM_VMX_DEFAULT_PLE_GAP 41
> >>>> +#define KVM_VMX_DEFAULT_PLE_WINDOW 4096
> >>>> +static int __read_mostly ple_gap = KVM_VMX_DEFAULT_PLE_GAP;
> >>>> +module_param(ple_gap, int, S_IRUGO);
> >>>> +
> >>>> +static int __read_mostly ple_window = KVM_VMX_DEFAULT_PLE_WINDOW;
> >>>> +module_param(ple_window, int, S_IRUGO);
> >>>>
> >>>> Shouldn't be __read_mostly since they're read very rarely
> >>>> (__read_mostly should be for variables that are very often read,
> >>>> and rarely written).
> >>>>
> >>> In general, they are read only except that experienced user may try
> >>> different parameter for perf tuning.
> >>>
> >>
> >>
> >> __read_mostly doesn't just mean it's read mostly. It also means it's
> >> read often. Otherwise it's just wasting space in hot cachelines.
> >>
> >>
> >>>> I'm not even sure they should be parameters.
> >>>>
> >>> For different spinlock in different OS, and for different workloads,
> >>> we need different parameter for tuning. It's similar as the
> >>> enable_ept.
> >>>
> >>
> >> No, global parameters don't work for tuning workloads and guests since
> >> they cannot be modified on a per-guest basis. enable_ept is only
> >> useful for debugging and testing.
> >>
> >>
> >>>>> + set_current_state(TASK_INTERRUPTIBLE);
> >>>>> + schedule_hrtimeout(&expires, HRTIMER_MODE_ABS);
> >>>>> +
> >>>>>
> >>>> Please add a tracepoint for this (since it can cause significant
> >>>> change in behaviour),
> >>> Isn't trace_kvm_exit(exit_reason, ...) enough? We can tell the PLE
> >>> vmexit from other vmexits.
> >>>
> >>
> >> Right. I thought of the software spinlock detector, but that's another
> >> problem.
> >>
> >> I think you can drop the sleep_time parameter, it can be part of the
> >> function. Also kvm_vcpu_sleep() is confusing, we also sleep on halt.
> >> Please call it kvm_vcpu_on_spin() or something (since that's what the
> >> guest is doing).
>
> kvm_vcpu_on_spin() should add the vcpu to vcpu->wq (so a new pending
> interrupt wakes it up immediately).

Updated version (also please send it separately from the vmx.c patch):

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 894a56e..43125dc 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -231,6 +231,7 @@ int kvm_is_visible_gfn(struct kvm *kvm, gfn_t gfn);
 void mark_page_dirty(struct kvm *kvm, gfn_t gfn);
 
 void kvm_vcpu_block(struct kvm_vcpu *vcpu);
+void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu);
 void kvm_resched(struct kvm_vcpu *vcpu);
 void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
 void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4d0dd39..e788d70 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1479,6 +1479,21 @@ void kvm_resched(struct kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvm_resched);
 
+void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu)
+{
+	ktime_t expires;
+	DEFINE_WAIT(wait);
+
+	prepare_to_wait(&vcpu->wq, &wait, TASK_INTERRUPTIBLE);
+
+	/* Sleep for 100 us, and hope lock-holder got scheduled */
+	expires = ktime_add_ns(ktime_get(), 100000UL);
+	schedule_hrtimeout(&expires, HRTIMER_MODE_ABS);
+
+	finish_wait(&vcpu->wq, &wait);
+}
+EXPORT_SYMBOL_GPL(kvm_vcpu_on_spin);
+
 static int kvm_vcpu_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 {
 	struct kvm_vcpu *vcpu = vma->vm_file->private_data;
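
For readers who want to experiment outside the kernel, here is a rough user-space analogue of the idea behind kvm_vcpu_on_spin(): sleep for a bounded time, but wake early if something becomes pending. It is only a sketch and not kernel code. The names fake_vcpu, fake_vcpu_init(), fake_vcpu_kick() and fake_vcpu_on_spin() are made up for illustration. A POSIX condition variable stands in for vcpu->wq and a boolean flag stands in for a pending interrupt. Unlike the patch, the sketch re-checks the flag under a mutex, which is how a condition variable avoids a lost wakeup.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical user-space stand-in for a vcpu and its wait queue. */
struct fake_vcpu {
	pthread_mutex_t lock;
	pthread_cond_t wq;	/* plays the role of vcpu->wq */
	bool pending;		/* plays the role of a pending interrupt */
};

static int fake_vcpu_init(struct fake_vcpu *v)
{
	pthread_condattr_t attr;
	int err;

	v->pending = false;
	err = pthread_mutex_init(&v->lock, NULL);
	if (err)
		return err;
	pthread_condattr_init(&attr);
	/* Monotonic clock: the absolute deadline ignores wall-clock jumps. */
	pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
	err = pthread_cond_init(&v->wq, &attr);
	pthread_condattr_destroy(&attr);
	return err;
}

/* Analogue of raising an interrupt: set the flag and wake the sleeper. */
static void fake_vcpu_kick(struct fake_vcpu *v)
{
	pthread_mutex_lock(&v->lock);
	v->pending = true;
	pthread_cond_signal(&v->wq);
	pthread_mutex_unlock(&v->lock);
}

/*
 * Sleep for at most timeout_ns, returning early if a kick arrives.
 * Returns true if woken by a kick, false if the timeout expired.
 */
static bool fake_vcpu_on_spin(struct fake_vcpu *v, long timeout_ns)
{
	struct timespec deadline;
	bool woken;
	int err = 0;

	assert(timeout_ns >= 0);

	/* Build an absolute deadline, as HRTIMER_MODE_ABS does in the patch. */
	clock_gettime(CLOCK_MONOTONIC, &deadline);
	deadline.tv_sec += timeout_ns / 1000000000L;
	deadline.tv_nsec += timeout_ns % 1000000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_nsec -= 1000000000L;
		deadline.tv_sec += 1;
	}

	pthread_mutex_lock(&v->lock);
	/* Loop to absorb spurious wakeups; stop on kick or timeout. */
	while (!v->pending && err == 0)
		err = pthread_cond_timedwait(&v->wq, &v->lock, &deadline);
	woken = v->pending;
	v->pending = false;	/* consume the event */
	pthread_mutex_unlock(&v->lock);

	return woken;
}
```

Calling fake_vcpu_on_spin(&v, 100000L) mirrors the 100 us sleep in the patch. A fake_vcpu_kick() from another thread, or one issued just before the call, makes it return early instead of waiting out the full timeout.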