On Fri, Feb 24, 2017 at 10:18:59AM +0100, Paolo Bonzini wrote:
>
>
> On 24/02/2017 00:19, Marcelo Tosatti wrote:
> >>> i.e. our feature implies userspace tasks pinned to isolated vCPUs.
> > This is how cpufreq-userspace works:
> >
> >   2.2 Governor
> >   ------------
> >
> > On all other cpufreq implementations, these boundaries still need to
> > be set. Then, a "governor" must be selected. Such a "governor" decides
> > what speed the processor shall run within the boundaries. One such
> > "governor" is the "userspace" governor. This one allows the user - or
> > a yet-to-implement userspace program - to decide what specific speed
> > the processor shall run at.
>
> The userspace program sets a policy for the whole system.

No, it's per CPU.

> >> That's bad. This feature is broken by design unless it does proper
> >> save/restore across preemption.
> >
> > Whats the current usecase, or forseeable future usecase, for save/restore
> > across preemption again? (which would validate the broken by design
> > claim).
>
> Stop a guest that is using cpufreq, start a guest that is not using it.
> The second guest's performance now depends on the state that the first
> guest left in cpufreq.

Nothing prevents the host from implementing switching with the
current hypercall interface: all you need is a scheduler hook.

> I think this is abusing the userspace governor. Unfortunately cpufreq
> governors cannot be stacked.
>
> Paolo

This is a special usecase where only the app in the guest knows
what the most appropriate frequency is at a given time. This is what
cpufreq-userspace is supposed to allow userspace to do, but in this
case "userspace" is the guest, so I don't see this as an abuse at all.

Timeshared setups are by definition not deterministic: your task A
could be interrupted by another task B, with results similar to a
lower frequency being set.

So saying that:

"Our frequency scaling interface goes against the idea -- guest kernel
cannot schedule multiple userspaces on the same vCPU, because they
could conflict by overriding frequency."

assumes that, in a timeshared system, an application is guaranteed a
particular frequency. But that does not make sense: it's a timeshared
system in the first place, so there is no determinism regarding
execution time.

Moreover, there is no notion of "per-task CPU frequency" in Linux
(there could be; this whole governor business, with the user being
responsible for setting up the governor, is pretty sucky IMO).
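
To make the "per CPU" point above concrete, here is a minimal sketch of
how a userspace program could drive the userspace governor for a single
CPU through the standard cpufreq sysfs attributes (scaling_governor and
scaling_setspeed). The helper names and the "root" parameter are
hypothetical, introduced only for illustration. It assumes root
privileges and a cpufreq driver that offers the userspace governor.
Note that CPUs sharing one cpufreq policy (see affected_cpus) change
together, so "per CPU" holds only where each CPU has its own policy.

```python
import os

# Standard sysfs location of the cpufreq attributes. `root` is a parameter
# only so the helpers can be pointed at a scratch directory (assumption).
SYSFS_CPU_ROOT = "/sys/devices/system/cpu"


def cpufreq_attr(cpu, name, root=SYSFS_CPU_ROOT):
    """Return the path of one cpufreq attribute for one CPU."""
    return os.path.join(root, f"cpu{cpu}", "cpufreq", name)


def set_userspace_speed(cpu, khz, root=SYSFS_CPU_ROOT):
    """Select the userspace governor for `cpu`, then request `khz` (in kHz)."""
    with open(cpufreq_attr(cpu, "scaling_governor", root), "w") as f:
        f.write("userspace")
    with open(cpufreq_attr(cpu, "scaling_setspeed", root), "w") as f:
        f.write(str(khz))
```

For example, set_userspace_speed(3, 2400000) would touch only the
attributes under cpu3, leaving other policies alone.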