On Wed, Mar 15, 2017 at 09:23:10AM +0100, Paolo Bonzini wrote:
>
>
> On 15/03/2017 00:27, Marcelo Tosatti wrote:
> >> So, the question then is how to design the hypervisor so that these NFV
> >> virtual machines can play with cpufreq, but there are no adverse
> >> indefinite effects.
> > Ok, we can modify the cpufreq cgroups patch, to, from the hypercalls
> > set the:
> >
> > "The first three patches of this series introduces
> > capacity_{min,max} tracking
> > in the core scheduler, as an extension of the CPU controller."
> >
> > capacity_min == capacity_max values (which forces the CPU to run
> > at that frequency, given there are no other tasks requesting
> > frequency information on that CPU).
> >
> > This is good enough for DPDK.
>
> So this sounds like a plan?

Yes, trying that now...

>
> >> One possibility is to have some kind of per-task
> >> cpufreq. Another is to do everything in userspace with virtual ACPI
> >> P-states and the userspace governor in the VM.
> >
> > Virtual ACPI P-state, that is an option. But why not make it
> > in-kernel, the exit to userspace can be a significant
> > fraction of the total if the frequency change time is small (say, 10us
> > freq change and 5us for userspace exit).
>
> The advantage of doing it in userspace is that the sysfs chmod is a
> clear way to say "this VM should have the privilege of setting cpufreq."
> In effect, userspace's file descriptor for the sysfs files represents
> the capability to set cpufreq for the VM. You can even pass the file
> descriptor with SCM_RIGHTS if you wish to do so.
>
> But of course that's only needed if the frequency change is global per
> physical CPU. If the CPU controller gains the ability to do per-task
> frequency switching, that's even better for KVM. Then the hypercalls
> are just fine and we can have a KVM-specific cpufreq controller.
>
> Paolo

I see, thanks.
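
For reference, the "file descriptor as capability" idea quoted above can be
sketched roughly as follows. This is only an illustration under stated
assumptions, not code from any patch in this thread: the helper names
(send_cpufreq_fd, recv_cpufreq_fd, demo) are made up, and a temporary file
stands in for the real sysfs file (for example
/sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed with the userspace
governor) so that it runs without privileges. It needs Python 3.9+ on a
Unix system for socket.send_fds() and socket.recv_fds().

```python
import os
import socket
import tempfile


def send_cpufreq_fd(sock, fd):
    """Pass an open fd to the peer as SCM_RIGHTS ancillary data (hypothetical helper)."""
    # At least one byte of ordinary data must accompany the ancillary message.
    socket.send_fds(sock, [b"F"], [fd])


def recv_cpufreq_fd(sock):
    """Receive one fd sent with send_cpufreq_fd() (hypothetical helper)."""
    _data, fds, _flags, _addr = socket.recv_fds(sock, 1, 1)
    return fds[0]


def demo():
    # Assumption: the temp file stands in for the sysfs scaling_setspeed file.
    with tempfile.NamedTemporaryFile() as stand_in:
        holder, vmm = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
        with holder, vmm:
            # The privileged side opens the file and hands the fd over.
            fd = os.open(stand_in.name, os.O_WRONLY)
            try:
                send_cpufreq_fd(holder, fd)
            finally:
                os.close(fd)  # the in-flight copy stays valid

            # The receiving side (e.g. the VM's userspace) can now write a
            # frequency in kHz without ever having opened the file itself.
            received = recv_cpufreq_fd(vmm)
            try:
                os.write(received, b"2400000\n")
            finally:
                os.close(received)

        with open(stand_in.name, "rb") as f:
            return f.read()


if __name__ == "__main__":
    print(demo())
```

The receiver holds write access only through the descriptor it was handed,
which is the sense in which the descriptor represents the privilege.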