2017-02-02 15:47-0200, Marcelo Tosatti: > Implement min/max/up/down frequency change > KVM hypercalls. To be used by DPDK implementation. > > Also allow such hypercalls from guest userspace. > > Signed-off-by: Marcelo Tosatti <mtosatti@xxxxxxxxxx> > > --- > Index: kvm-pvfreq/arch/x86/kvm/x86.c > =================================================================== > --- kvm-pvfreq.orig/arch/x86/kvm/x86.c 2017-02-02 11:17:17.063756725 -0200 > +++ kvm-pvfreq/arch/x86/kvm/x86.c 2017-02-02 11:17:17.822752510 -0200 > @@ -6219,10 +6219,58 @@ [Here lived copy-paste.] > int kvm_emulate_hypercall(struct kvm_vcpu *vcpu) > { > unsigned long nr, a0, a1, a2, a3, ret; > int op_64_bit, r; > + bool cpl_check; > > r = kvm_skip_emulated_instruction(vcpu); > > @@ -6246,7 +6294,13 @@ > a3 &= 0xFFFFFFFF; > } > > - if (kvm_x86_ops->get_cpl(vcpu) != 0) { > + cpl_check = true; > + if (nr == KVM_HC_FREQ_UP || nr == KVM_HC_FREQ_DOWN || > + nr == KVM_HC_FREQ_MIN || nr == KVM_HC_FREQ_MAX) > + if (vcpu->arch.allow_freq_hypercall == true) > + cpl_check = false; > + > + if (cpl_check == true && kvm_x86_ops->get_cpl(vcpu) != 0) { > ret = -KVM_EPERM; > goto out; > } > @@ -6262,6 +6316,21 @@ > case KVM_HC_CLOCK_PAIRING: > ret = kvm_pv_clock_pairing(vcpu, a0, a1); > break; > +#ifdef CONFIG_CPU_FREQ_GOV_USERSPACE CONFIG_CPU_FREQ_GOV_USERSPACE should be checked when enabling the capability. > + case KVM_HC_FREQ_UP: > + ret = kvm_pvfreq_up(vcpu); > + break; > + case KVM_HC_FREQ_DOWN: > + ret = kvm_pvfreq_down(vcpu); > + break; > + case KVM_HC_FREQ_MAX: > + ret = kvm_pvfreq_max(vcpu); > + break; > + case KVM_HC_FREQ_MIN: > + ret = kvm_pvfreq_min(vcpu); > + break; Having 4 hypercalls for this is an overkill. You can make it one hypercall with an argument. And the argument doesn't have to be enum {UP, DOWN, MAX, MIN}, but an int, which would also allow you to do -2 steps. A number over the capabilites of stepping would just map to MAX/MIN. Avoiding an absolute scale for interface simplifies migration, where the guest cannot really depend much on this. Except that calling it with MIN (INT_MIN) will get the minimum and MAX (INT_MAX) the maximum frequency. Plese explictly say in documentation that things like the number of steps, which the guest can learn by doing MAX and then -1 until the hypercall fails, is undefined and should not be depended upon. Userspace might still want know the number of steps to avoid useless hypercall -- I think we should return a different value when the limit is reached, not just after the guest wants to go past it. > +#endif > + > default: > ret = -KVM_ENOSYS; > break; And thinking more about migration, userspace cannot learn the current frequency (at least MIN/MAX), so the new host will just pick at random, which will break userspace's expectations that it cannot increase or decrease the frequency. Is migration left for the future, because DPDK doesn't migrate anyway? Thanks.