On 15/03/2017 00:27, Marcelo Tosatti wrote:
>> So, the question then is how to design the hypervisor so that these NFV
>> virtual machines can play with cpufreq, but there are no adverse
>> indefinite effects.
> Ok, we can modify the cpufreq cgroups patch, to, from the hypercalls
> set the:
>
> "The first three patches of this series introduces
> capacity_{min,max} tracking
> in the core scheduler, as an extension of the CPU controller."
>
> capacity_min == capacity_max values (which forces the CPU to run
> at that frequency, given there are no other tasks requesting
> frequency information on that CPU).
>
> This is good enough DPDK. So this sounds like a plan?

>> One possibility is to have some kind of per-task
>> cpufreq. Another is to do everything in userspace with virtual ACPI
>> P-states and the userspace governor in the VM.
>
> Virtual ACPI P-state, that is an option. But why not make it
> in-kernel, the exit to userspace can be a significant
> fraction of the total if the frequency change time is small (say, 10us
> freq change and 5us for userspace exit).

The advantage of doing it in userspace is that a chmod on the sysfs
files is a clear way to say "this VM should have the privilege of
setting cpufreq". In effect, userspace's file descriptor for the sysfs
files represents the capability to set cpufreq for the VM. You can even
pass the file descriptor with SCM_RIGHTS if you wish to do so.

But of course that's only needed if the frequency change is global per
physical CPU. If the CPU controller gains the ability to do per-task
frequency switching, that's even better for KVM. Then the hypercalls
are just fine and we can have a KVM-specific cpufreq controller.

Paolo
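
P.S. To make the "file descriptor as capability" point concrete, here is
a minimal userspace sketch of passing an open descriptor with SCM_RIGHTS.
It is only an illustration under stated assumptions: a temporary file
stands in for the cpufreq sysfs file, a socketpair stands in for the
channel between a privileged holder and the VMM, and the names
send_fd, recv_fd and pass_fd_demo are made up for this example.

```python
import array
import os
import socket
import tempfile


def send_fd(sock, fd):
    """Send one open file descriptor as SCM_RIGHTS ancillary data."""
    fds = array.array("i", [fd])
    sock.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fd(sock):
    """Receive one file descriptor sent with SCM_RIGHTS."""
    fds = array.array("i")
    _msg, ancdata, _flags, _addr = sock.recvmsg(
        1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Drop any trailing partial integer before decoding.
            usable = len(data) - (len(data) % fds.itemsize)
            fds.frombytes(data[:usable])
    if not fds:
        raise RuntimeError("no file descriptor received")
    return fds[0]


def pass_fd_demo(value=b"1200000"):
    """Holder opens the file; the receiver writes via the passed fd only."""
    # Assumption: a temp file stands in for the sysfs cpufreq file, so
    # the sketch runs without root or cpufreq hardware.
    holder_fd, path = tempfile.mkstemp()
    holder, vmm = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        send_fd(holder, holder_fd)
        # The receiver never opens the path; the descriptor it was
        # handed is its whole permission to write.
        vmm_fd = recv_fd(vmm)
        try:
            os.write(vmm_fd, value)
        finally:
            os.close(vmm_fd)
        # Read back through the holder's descriptor to confirm that
        # both descriptors refer to the same open file.
        return os.pread(holder_fd, 64, 0)
    finally:
        holder.close()
        vmm.close()
        os.close(holder_fd)
        os.unlink(path)
```

The receiving side needs no permission on the path itself, which is the
property that would let a management process open the sysfs file once
and hand write access only to the VMs that should have it.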